I.   INTRODUCTION

An  ice  cream  factory  is  considering  two  different  brands  of 
dispensers  to  fill  their  cartons.   Both  brands  can  be  adjusted  to 
the  desired  number  of  ounces,  and  this  amount  is  automatically 
dispensed at regular intervals. The company is concerned that Brand
S  (which  is  considerably  less  expensive  than  Brand  G)  will  not  be  as 
precise  as  Brand  G  in  the  amount  of  ice  cream  it  puts  into  the 
cartons.  Thus  they  are  interested  in  testing  the  variability  of  the 
two  brands  of  dispensers,  and  if  Brand  S  is  not  significantly  less 
precise  in  the  amounts  it  is  dispensing,  they  will  use  the  less 
expensive  brand.   A  more  formal  statement  of  their  problem  follows. 

Let $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ be independent random
samples from continuous c.d.f.'s $F(x)$ and $G(y)$, respectively.
Assume these distributions are identical except for scale. Let
$\theta_x$ be the scale parameter of $F(x)$ and $\theta_y$ be the scale
parameter of $G(y)$, and let $\theta = \theta_x/\theta_y$. The problem
we consider in this report is the one-tailed test
$H_0: \theta = 1$ vs. $H_1: \theta > 1$.

The usual statistic for this test is $F = (S_x/S_y)^2$, where $S$ is
the sample standard deviation. We reject the null hypothesis if
$F > F(\alpha, n-1, m-1)$. However, the F test supposes $F(x)$ and $G(y)$ to be

normal  c.d.f.'s  and  is  known  to  be  very  sensitive  to  departures  from 

this  assumption.   For  example,  Box  (1953)  discusses  the  problem  and 

cites  several  previous  references.   Wasserstein  (1987)  shows  through 


simulation that under distributions other than the normal, the F
test does not even retain the $\alpha$ level when testing at the null
hypothesis.   He  discusses  several  alternative  tests  and  compares 
their  performance  under  various  conditions.   He  further  suggests  the 
use  of  permutation  tests  based  on  functions  of  robust  estimators 
such  as  trimmed  means.   In  this  study  we  will  investigate  the 
performance  of  such  tests  for  the  two  sample  scale  problem  presented 
above.


II.   The  Problem  of  Interest 
A.  Trimmed  Means 

Let $x_1 < \ldots < x_n$ be an ordered sample of size n from a
population with distribution function F(x). The $100\alpha$ percent
trimmed mean is defined (Boyer and Kolson (1983)) by

$$
m(\alpha) = \frac{\sum_{i=[n\alpha]+2}^{n-[n\alpha]-1} x_i
  + (1+[n\alpha]-n\alpha)\,(x_{[n\alpha]+1} + x_{n-[n\alpha]})}{n(1-2\alpha)},
$$

where $[n\alpha]$ denotes the integer part of $n\alpha$.

Hence $m(\alpha)$ is the average of the sample values that remain after
a proportion $\alpha$ have been "trimmed" from each end of the sample.
The average of those discarded observations (i.e. the "mean of the
trimmings") is defined:

$$
m^c(\alpha) = \frac{\sum_{i=1}^{[n\alpha]} (x_i + x_{n-i+1})
  + (n\alpha-[n\alpha])\,(x_{[n\alpha]+1} + x_{n-[n\alpha]})}{2n\alpha}.
$$


We note that commonly used estimators can be thought of as limiting
forms of trimmed means: $m(.5)$ and $m^c(0)$ are defined respectively
to be the median and midrange, while $m(0) = m^c(.5)$ is the mean.
Each of these three is the most efficient estimator of location (in
fact, they are UMVUE's) for a different distribution, namely the
midrange for the uniform distribution, the mean for the normal, and
the median for the double exponential.

As an example, let z = (1, 2, 3, 5, 9) be the sample vector.
The twenty percent trimmed mean, $m(.2)$, is the average of the
observations that remain after trimming (.2)(5) = 1 observation from
each end of the sample, so $m(.2)$ = (2+3+5)/3 = 10/3. The average
of those two trimmed observations is $m^c(.2)$ = (1+9)/2 = 5.
According to the definition of $m^c(0)$ (the midrange) and because our
sample size is five, $m^c(0) = m^c(.2) = 5$. The median is $m(.5) = 3$,
and the mean of this sample is $m(0)$ = (1+2+3+5+9)/5 = 4.

Note that the definition allows for fractional parts of
observations to be used if $n\alpha$ is not an integer. For example,
$m^c(.3)$ is the average of the smallest 1.5 observations (i.e. 1 and
.5*2) and the largest 1.5 observations (i.e. 9 and .5*5) so
$m^c(.3)$ = (1+1+9+2.5)/(2*5*.3) = 13.5/3 = 4.5.
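The two definitions above can be sketched in a few lines of Python. This is only an illustration, not the report's Fortran routines: the function names and the "overlap weight" formulation are assumptions made here. Each order statistic is given a weight equal to the part of it that survives trimming, which reproduces the fractional-observation rule in the worked examples; the limiting cases $m(.5)$ and $m^c(0)$ are set to the median and midrange by definition, as in the text.

```python
def _kept_weights(n, alpha):
    """Weight of each order statistic that is kept (not trimmed).

    Observation i (0-based) occupies [i, i+1]; the kept region is
    [n*alpha, n*(1-alpha)], so the weight is the length of the overlap.
    """
    lo, hi = n * alpha, n * (1.0 - alpha)
    return [max(0.0, min(i + 1.0, hi) - max(float(i), lo)) for i in range(n)]


def trimmed_mean(x, alpha):
    """m(alpha); alpha = 0.5 is defined to be the median."""
    xs = sorted(x)
    n = len(xs)
    if alpha >= 0.5:
        mid = n // 2
        return xs[mid] if n % 2 else 0.5 * (xs[mid - 1] + xs[mid])
    w = _kept_weights(n, alpha)
    return sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)


def mean_of_trimmings(x, alpha):
    """m^c(alpha); alpha = 0 is defined to be the midrange."""
    xs = sorted(x)
    n = len(xs)
    if alpha <= 0.0:
        return 0.5 * (xs[0] + xs[-1])
    w = [1.0 - wi for wi in _kept_weights(n, alpha)]
    return sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)


z = [1, 2, 3, 5, 9]
print(trimmed_mean(z, 0.2), mean_of_trimmings(z, 0.3))
```

For the sample z = (1, 2, 3, 5, 9) these functions return the values computed by hand above.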


B.  Test  Statistics 

Since, as previously noted, trimmed means efficiently estimate
location in various distributions, we speculate that functions of
these trimmed means might be efficient estimators of scale. Thus in
this study, we estimate the scale parameter of both populations,
then use a test statistic which is the ratio of those two estimates,
as in the F-test. The scale estimators can be defined as follows:
Let $m(\alpha)$ denote the $100\alpha$ percent trimmed mean of a sample
$z_1, \ldots, z_n$. Subtract $m(\alpha)$ from each sample value and
square those deviations, yielding $w_1, \ldots, w_n$, say. Then find
the same $100\alpha$ percent trimmed mean of the $w_i$'s. The square
root of this trimmed mean is our estimator of scale. The definition
follows similarly for $m^c(\alpha)$, the $100\alpha$ percent mean of
trimmings. It is readily seen that these estimators are invariant to
changes in location, so that we need not even assume our populations
are identical in location.

To illustrate our method of estimating scale, again let the
sample vector be z = (1, 2, 3, 5, 9). We will calculate estimates of
scale based on all five trimmed means that were demonstrated in the
previous section. We determined that $m^c(0) = m^c(.2) = 5$. Let v be
the vector of deviations from 5, then v = (-4, -3, -2, 0, 4), and
the (ordered) vector of squared deviations is w = (0, 4, 9, 16, 16).
The twenty percent mean of trimmings for w is (0+16)/2 = 8, so the
estimate of scale based on $m^c(.2)$ (and $m^c(0)$) is
$\sqrt{8} = 2.83$.

For $m(0) = 4$, w = (1, 1, 4, 9, 25) and the estimate of the scale
parameter has value $\sqrt{(1+1+4+9+25)/5} = \sqrt{8} = 2.83$. Since
$m(.5) = 3$, the median based scale estimate is $\sqrt{4} = 2$, as
computed from w = (0, 1, 4, 4, 36). Finally, $m(.2) = 10/3$, so
v = (-7/3, -4/3, -1/3, 5/3, 17/3), and
w = (1/9, 16/9, 25/9, 49/9, 289/9). The twenty percent trimmed mean
of w is [(16+25+49)/9]/3 = 10/3, and scale is estimated as
$\sqrt{10/3} = 1.83$.
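A compact Python sketch of this scale estimator follows. It is an illustration under stated assumptions rather than the report's implementation: the helper `_location` and the `trimmings` flag are names invented here, and the fractional trimming is done with overlap weights as one convenient way to apply the definitions of $m(\alpha)$ and $m^c(\alpha)$.

```python
import math


def _location(values, alpha, trimmings):
    """m(alpha) if trimmings is False, m^c(alpha) if True."""
    xs = sorted(values)
    n = len(xs)
    if not trimmings and alpha >= 0.5:           # median by definition
        mid = n // 2
        return xs[mid] if n % 2 else 0.5 * (xs[mid - 1] + xs[mid])
    if trimmings and alpha <= 0.0:               # midrange by definition
        return 0.5 * (xs[0] + xs[-1])
    lo, hi = n * alpha, n * (1.0 - alpha)
    kept = [max(0.0, min(i + 1.0, hi) - max(float(i), lo)) for i in range(n)]
    w = [1.0 - k for k in kept] if trimmings else kept
    return sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)


def scale_estimate(z, alpha, trimmings=False):
    """Square root of the same trimmed statistic applied to squared deviations."""
    center = _location(z, alpha, trimmings)
    w = [(zi - center) ** 2 for zi in z]
    return math.sqrt(_location(w, alpha, trimmings))


def scale_ratio(x, y, alpha, trimmings=False):
    """Test statistic: ratio of the two scale estimates (x over y)."""
    return scale_estimate(x, alpha, trimmings) / scale_estimate(y, alpha, trimmings)


z = [1, 2, 3, 5, 9]
print(round(scale_estimate(z, 0.2, trimmings=True), 2))
print(round(scale_estimate(z, 0.2), 2))
```

Applied to z = (1, 2, 3, 5, 9), the sketch gives the same five estimates as the hand calculation above.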

If we use $m(0)$ (i.e. the mean) as the basis for estimating
scale, then our estimator is the square root of
$\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$,
which is the usual estimator of variance (using n rather than n-1).
Hence our test statistic is the square root of the F test statistic.
Using $m(.5)$, scale is estimated as the median deviation from the
median, another common estimator, and the midrange type estimator is
very nearly the range estimator of scale. Thus, certain of the
tests examined in this report closely correspond to statistics
currently in use.

The estimators of scale employed here may not be (in fact, they
probably are not) unbiased estimators of $\theta_x$ or $\theta_y$.
However $\hat{\theta}_x$ is an unbiased estimator of $c\theta_x$, for
some constant c, and $\hat{\theta}_y$ is similarly an unbiased
estimator of $c\theta_y$. Hence the ratio
$\hat{\theta}_x/\hat{\theta}_y$ is a reasonable estimate of
$c\theta_x/c\theta_y = \theta_x/\theta_y = \theta$.


C.  A  family  of  symmetric  distributions 

Prescott (1978) discusses the robustness properties of trimmed
means and means of trimmings as unbiased estimators of the location
parameter $\mu$ in the exponential power family of distributions
defined (Hogg (1972)) by the density function

$$
f(x) = \left(2\,\Gamma(1 + 1/r)\right)^{-1} e^{-|x-\mu|^r}
\qquad (-\infty < x < \infty,\ r \ge 1).
$$

The distributions in this family are symmetric about $\mu$ with
variance $\Gamma(3/r)/\Gamma(1/r)$. If we let $\gamma = 1/r$ be a
continuous parameter in the interval [0,1], this family can be shown
to contain distributions which range from the uniform ($\gamma = 0$)
through short-tailed symmetric distributions to the normal
($\gamma = 1/2$), then through long-tailed symmetric distributions to
the double exponential ($\gamma = 1$). This family of distributions
will be referred to throughout the remainder of this report as the
Prescott family.

D.  Adaptive  Estimation  and  Testing 

Prescott (1978) also discusses the use of an adaptive scheme
for estimating location in this family. Several adaptive statistics
are proposed whereby the trimming proportion $\alpha$ is based upon a
measure of nonnormality or tailweight. In particular, Prescott
(1978) and Boyer and Kolson (1983) have shown the following to be
the preferred estimator for small sample sizes (n<50) such as are
used in this study.

$$
T = \begin{cases}
m^c(0.2), & Q < 2.2 \\
m^c(0.3), & 2.2 \le Q < 2.4 \\
m(0),     & 2.4 \le Q < 2.8 \\
m(0.2),   & 2.8 \le Q < 3.0 \\
m(0.3),   & 3.0 \le Q
\end{cases}
$$


The choice of location estimator for this statistic is based on a
measure of nonnormality proposed by Hogg (1974), namely

$$
Q = \left(U(0.05) - L(0.05)\right) / \left(U(0.5) - L(0.5)\right),
$$

where $U(\beta)$ and $L(\beta)$ are the averages of the largest and
smallest $n\beta$ order statistics, respectively, with fractional
items used if $n\beta$ is not an integer. The choice of Q over other
measures of tailweight such as kurtosis is discussed in detail by
Hogg (1972, 1974) and Prescott (1978), as well as the choice of the
5 and 50% proportions.
We  use  T  as  the  basis  for  an  adaptive  procedure  in  testing  for 
equality  of  scale.   The  failure  of  the  F  test  in  non-normal 
distributions  motivates  the  use  of  an  adaptive  procedure.   We  first 

estimate  non-normality  using  Q,  then  select  a  scale  estimator  based 

on  the  trimmed  means  specified  in  T.   If  Q  suggests  the  distribution 
is  normal,  we  estimate  scale  based  on  the  mean,  which  is  equivalent 
to  using  the  Permutation  F  Test  to  test  our  hypothesis.   Otherwise, 
we  use  a  trimmed  mean  or  mean  of  trimmings  as  the  basis  for 
estimating  scale. 


In this problem we have two samples but wish to use the same
scale estimator, i.e. the same trimming proportion, for both
samples. Since Q is invariant to changes in scale, for each
particular distribution $Q_x = Q_y$, so $\hat{Q}_x$ should be
approximately equal in value to $\hat{Q}_y$. To avoid the possibility
of slight variations in the two estimates causing selection of
different trimming proportions, we let
$\hat{Q} = \frac{1}{2}(\hat{Q}_x + \hat{Q}_y)$ and use T to determine
the amount of trimming to be used in both samples. We then estimate
scale and form our test statistic in the manner that was described in
section B of chapter II.
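The selection rule can be sketched as follows. The function names are hypothetical, the handling of values falling exactly on an interval boundary is an assumption (the lower endpoint is taken as included), and the fractional-item rule is implemented by giving the last order statistic used a partial weight.

```python
def tail_mean(xs_sorted, beta, upper):
    """Average of the largest (upper=True) or smallest n*beta order
    statistics, using a fractional weight on the last item if needed."""
    k = len(xs_sorted) * beta
    xs = xs_sorted[::-1] if upper else xs_sorted
    total, remaining = 0.0, k
    for v in xs:
        if remaining <= 0.0:
            break
        wt = min(1.0, remaining)
        total += wt * v
        remaining -= wt
    return total / k


def hogg_q(x):
    """Sample value of Hogg's Q = (U(.05) - L(.05)) / (U(.5) - L(.5))."""
    xs = sorted(x)
    num = tail_mean(xs, 0.05, True) - tail_mean(xs, 0.05, False)
    den = tail_mean(xs, 0.5, True) - tail_mean(xs, 0.5, False)
    return num / den


def choose_estimator(q):
    """Map Q to (estimator type, trimming proportion) as in the statistic T."""
    if q < 2.2:
        return ("mean_of_trimmings", 0.2)
    if q < 2.4:
        return ("mean_of_trimmings", 0.3)
    if q < 2.8:
        return ("trimmed_mean", 0.0)
    if q < 3.0:
        return ("trimmed_mean", 0.2)
    return ("trimmed_mean", 0.3)


def pooled_choice(x, y):
    """Average the two sample Q values so both samples share one estimator."""
    q = 0.5 * (hogg_q(x) + hogg_q(y))
    return q, choose_estimator(q)


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2 * v for v in x]
print(pooled_choice(x, y))
```

With n = 10, the 5% tail contains half an observation, so U(0.05) and L(0.05) reduce to the sample maximum and minimum.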

E.  Permutation  Tests 

Since the distributions of the test statistics used in this
study are not mathematically tractable, we use a randomization
procedure  to  perform  the  test  of  hypothesis.   Dwass  (1957)  gives  a 
more  rigorous  definition  of  permutation  tests  than  will  be  presented 
here.   Our  purpose  is  to  explain  the  procedure  in  this  context. 

Suppose $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ are two independent
random samples from continuous distributions, with

$$
z = (z_1, \ldots, z_n, z_{n+1}, \ldots, z_{n+m})
  = (x_1, \ldots, x_n, y_1, \ldots, y_m)
$$

being the combined sample of size N = n+m. Let u(z) be a statistic
based on z and let t = u(z) be the value of $u(\cdot)$ for the
observed z. Consider the $r = \frac{N!}{n!\,m!}$ permutations of the
indices of z which divide z into two subsamples. The set
$u_1, \ldots, u_r$ comprises the permutation sampling distribution of
the statistic $u(\cdot)$. Note we make no distributional assumptions
about $u(\cdot)$. Now compare t to this sampling distribution. If k
of the $u_i$ are as extreme or more extreme than t, then the observed
p-value for this test is k/r.

If indeed the null hypothesis of no scale differences is true,
then the populations are identical. In that circumstance, we can
think of randomly assigning the labels X and Y to the observations,
or equivalently, randomly dividing z into two subsets. The observed
statistic t is thus, under $H_0$, a randomly chosen element from the
distribution of $u(\cdot)$, the set of all possible such elements. On
the average, t will have a value at or near the mean of $u(\cdot)$,
and such a value is unlikely to lead to a conclusion in favor of an
alternative hypothesis. It is important to note that this test is
conditional upon the data itself. However, the permutation test
procedure does have an overall significance level $\alpha$ (Randles
and Wolfe (1979)) regardless of the underlying distribution.

While the permutation test is intuitively appealing, there is
one inherent problem. For small sample sizes, the permutation set
is relatively short and easily enumerable. For example, if n = m = 3,
there are only 20 possible permutations. However, for n = m = 10,
there are 184,756 possible permutations to consider, too large a set
to evaluate in practice (especially in a study involving runs of 1000
replications each!). Thus, a subset sampling approach first
suggested by Dwass (1957) holds considerable merit. We randomly
sample 500 out of the set of all permutations, and calculate u(z)
for each of those 500. If 20 of the u(z) are more extreme than t,
our p-value is 20/500 = 0.04, which is an estimate of the actual
significance level we would have observed by evaluating all 184,756
permutations.
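The subset sampling idea can be illustrated with a short Python sketch. Several choices here are assumptions made for illustration: the statistic is the plain ratio of standard deviations (the mean-based member of the family), splits are drawn independently so the same split may occasionally recur, and "as extreme or more extreme" is taken to mean a permuted ratio at least as large as the observed one, matching the one-tailed alternative $\theta > 1$.

```python
import math
import random


def sd_ratio(x, y):
    """Ratio of standard deviations (divisor n), x over y."""
    def sd(v):
        mean = sum(v) / len(v)
        return math.sqrt(sum((vi - mean) ** 2 for vi in v) / len(v))
    return sd(x) / sd(y)


def approx_permutation_pvalue(x, y, statistic=sd_ratio, n_perm=500, seed=0):
    """Estimate the permutation p-value from n_perm random splits of z."""
    rng = random.Random(seed)
    z = list(x) + list(y)
    n = len(x)
    t = statistic(x, y)                       # observed value of u(z)
    extreme = 0
    for _ in range(n_perm):
        idx = set(rng.sample(range(len(z)), n))   # indices relabelled "X"
        px = [z[i] for i in range(len(z)) if i in idx]
        py = [z[i] for i in range(len(z)) if i not in idx]
        if statistic(px, py) >= t:
            extreme += 1
    return extreme / n_perm


# Size of the full permutation sets quoted in the text.
print(math.comb(6, 3), math.comb(20, 10))

x = [-50, -40, -30, -20, -10, 10, 20, 30, 40, 50]
y = [-0.5, -0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4, 0.5]
print(approx_permutation_pvalue(x, y))
```

Any two-sample statistic with the same call signature, such as a ratio of trimmed-mean scale estimates, can be passed in place of `sd_ratio`.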

To  determine  if  500  sampled  permutations  is  sufficient  to 
estimate  the  actual  significance  level  of  the  test,  we  examined  the 
power  of  four  of  our  tests  for  one  distribution  (the  double 
exponential)  at  six  sizes  of  permutation  subset  sampling.   We  were 
looking  for  stability  in  the  power  estimates;  if  500  samples  gave 
approximately  the  same  estimate  of  power  as  1500  samples,  then  there 
would  not  be  a  need  to  use  1500. 

Wasserstein (1987) showed that a test based on 1600 samples is
highly comparable to full enumeration for this same problem. We
looked at subsets of 100, 250, 500, 750, 1000 and 1500 permutations.
At the null hypothesis (i.e. $\theta = \theta_x/\theta_y = 1$) there is
virtually no difference in either the .01 or .05 rejection rates
across the different sizes of subsets. (See Table II.E, which is
based on 500 replications of the simulation.) At $\theta = 2$ and
$\theta = 4$, there is a substantial power difference between a subset
of 100 and the other subsets, but once the subset size is increased
to 250, the rejection rates stabilize. Thus we do not seem to gain
substantial accuracy by choosing subsets of 1500 or even 1000 over
subsets of 500.

TABLE II.E   Comparison of Power at Different Levels of Subsampling

                      .01 Rejection Rates                      .05 Rejection Rates
Subset         m^c(0)              m^c(.5)              m^c(0)              m^c(.5)
size      θ=1   θ=2   θ=4     θ=1   θ=2   θ=4     θ=1   θ=2   θ=4     θ=1   θ=2   θ=4
 100     .010  .124  .444    .012  .118  .480    .040  .326  .764    .042  .352  .806
 250     .010  .142  .528    .010  .140  .528    .040  .346  .790    .040  .360  .846
 500     .008  .134  .506    .010  .118  .556    .040  .344  .784    .044  .364  .844
 750     .010  .144  .530    .010  .138  .564    .042  .348  .786    .044  .376  .846
1000     .008  .132  .514    .010  .126  .566    .044  .344  .780    .044  .366  .840
1500     .010  .140  .502    .010  .120  .558    .044  .344  .782    .044  .374  .836

                      .01 Rejection Rates                      .05 Rejection Rates
Subset         m(.5)               adaptive             m(.5)               adaptive
size      θ=1   θ=2   θ=4     θ=1   θ=2   θ=4     θ=1   θ=2   θ=4     θ=1   θ=2   θ=4
 100     .004  .058  .302    .014  .116  .484    .058  .252  .670    .052  .354  .820
 250     .010  .086  .388    .010  .158  .594    .054  .264  .690    .044  .376  .860
 500     .010  .064  .354    .012  .140  .562    .044  .260  .684    .048  .368  .852
 750     .008  .078  .374    .012  .150  .572    .052  .268  .690    .046  .378  .850
1000     .008  .068  .362    .010  .142  .566    .052  .264  .696    .048  .368  .846
1500     .008  .076  .348    .010  .142  .566    .054  .253  .698    .050  .374  .848

III.   A  Simulation  Study

A.  Scope  of  the  Simulation 

We  compare  by  simulation  the  power  of  eight  randomization 
tests,  each  based  on  robust  estimators  of  scale.   These  eight  tests 
will  be  referred  to  according  to  the  trimmed  mean  or  mean  of 
trimmings  used  in  estimating  the  scale  parameter.   One  of  these 
tests  uses  the  adaptive  estimation  statistic  T  described  in  section 
D of chapter II. The other seven use fixed levels of $\alpha$ (the
trimming proportion). Five of these comprise the adaptive
statistic; the median and midrange are also used. Hence the eight
statistics are based on functions of the following trimmed means:

1)  $m^c(0)$ -- the midrange

2)  $m^c(0.2)$

3)  $m^c(0.3)$

4)  $m^c(0.5) = m(0)$ -- the mean

5)  $m(0.2)$

6)  $m(0.3)$

7)  $m(0.5)$ -- the median

8)  the adaptive statistic, which uses one of 2) through 6)
based on the observed value of the statistic Q.

The tests were compared under several symmetric distributions,
with sample sizes of n = m = 10. Five values of $\gamma$ were chosen
to represent the exponential power family of distributions defined in
section II.C: $\gamma = 0$ (the uniform distribution);
$\gamma = 0.25$; $\gamma = 0.5$ (the normal); $\gamma = 0.75$; and
$\gamma = 1.0$ (the double exponential). We also used the Cauchy and
10% Mixed Normal, which consists of 90% N(0,1) contaminated with 10%
N(0,64). These two distributions were used by Wasserstein (1987),
and we also used them because his work on the same problem prompted
this study. In addition, these distributions tend to have heavier
tails than any of the members of the Prescott family.

Let $\mu_x$ and $\theta_x$ be, respectively, the location and scale
parameters of population 1, and let $\mu_y$ and $\theta_y$ be the
location and scale parameters of population 2. In the simulation,
$\mu_x = \mu_y = 0$, which causes no loss of generality since all the
tests are location invariant. Let $\theta = \theta_x/\theta_y$. Four
values of $\theta$ are considered in each distribution to provide a
wide range of power estimates. The results appear in Appendix 2.

B.  Description  of  the  Simulation  Program 

This  simulation  was  actually  executed  in  two  parts.   Part  one 
consisted  of  generating  the  sample  values  through  IMSL  subroutines 
on  an  NAS  6630  (National  Advanced  System)  mainframe.   The  remainder 
of the simulation was also written in Fortran but implemented on a
Harris  700  computer.   Both  programs  are  listed  in  Appendix  1. 

The required input for the sample generation program is as
follows: number of replications, sample sizes (n,m), the value of
$\gamma$, the values of $\theta_x$ and $\theta_y$, and the seeds for
the random number generators. (To generate from the Cauchy, set
$\gamma = 1.25$; for the Mixed Normal, set $\gamma = 1.50$. This is
for convenience only, and is not meant to imply that these
distributions belong to the Prescott family.) These values and the
sample data are then output to a file which is used as input for the
second part of the simulation. The Prescott family can be derived
via a power transformation from the gamma distribution with scale
parameter 1 and shape parameter $\gamma$, and this method was used to
generate these distributions.
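One way to read the power transformation is the following: if G has a gamma distribution with shape $\gamma$ and scale 1, then $G^{\gamma}$ with a random sign attached has density proportional to $e^{-|x|^{1/\gamma}}$. The Python sketch below uses that reading. It is not the report's IMSL-based Fortran routine; the function names are hypothetical, and the uniform limit $\gamma = 0$ is handled as a separate case on (-1, 1).

```python
import math
import random


def prescott_sample(n, gamma, theta=1.0, rng=random):
    """Draw n values from the Prescott family member gamma, scaled by theta.

    Assumes mu = 0. For gamma > 0 the draw is sign * G**gamma with
    G ~ Gamma(shape=gamma, scale=1); gamma = 0 is the uniform limit.
    """
    if gamma == 0.0:
        return [theta * rng.uniform(-1.0, 1.0) for _ in range(n)]
    out = []
    for _ in range(n):
        g = rng.gammavariate(gamma, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        out.append(theta * sign * g ** gamma)
    return out


def prescott_variance(gamma):
    """Variance Gamma(3/r) / Gamma(1/r) written in terms of gamma = 1/r."""
    return math.gamma(3.0 * gamma) / math.gamma(gamma)


rng = random.Random(1)
draws = prescott_sample(20000, 0.5, rng=rng)
sample_var = sum(d * d for d in draws) / len(draws)
print(round(sample_var, 3), prescott_variance(0.5))
```

For $\gamma = 0.5$ the draws are normal with variance 1/2, and for $\gamma = 1$ they are double exponential with variance 2, which agrees with the variance formula in section II.C.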

The  simulation  program  consists  of  four  main  parts,  which  are 
discussed  here  in  some  detail. 

1)  Input  all  parameters  associated  with  sample  generation, 
along  with  a  seed  for  the  random  number  generator  in  the  permutation 
test.   Set  all  arrays  to  zero. 

2)  Input the two samples, which are then combined and sorted
(for use in the permutation test). Calculate each of the test
statistics based on the original data. For the adaptive statistic,
only Q and the interval in which Q falls are calculated, since T will
always use one of the statistics previously calculated.

3)  Run the approximate permutation test by sampling 500 out of
the entire set of permutations, without replacement. Calculate each
test statistic and compare the permutation value to the original
value for each statistic. Calculate an approximate p-value as
e/500, where e is the number of permutation statistic values more
extreme than the original. To minimize the run time of the
simulation, whenever e exceeds 25 (5% of 500) for a particular
statistic, discontinue calculation of that statistic. If e is
greater than 25 for all statistics, then exit the permutation test.

The 500 permutation samples are generated in the following way.
Let N = n+m. A set of n integers between 1 and N is randomly
selected without replacement, representing the indices of the items
in the combined sample to be assigned to the first sample, with the
remaining items assigned to the second sample. The statistics are
then calculated from these two samples.

4)  Note which tests are significant at the $\alpha = .05$ level.
Repeat steps 2 and 3 as desired (1000 times in this study). Calculate
.05 rejection rates, the average number of permutations sampled and
the mean and variance of Q.

Figure  III.B  gives  a  partial  list  of  the  subroutines  used  in 
the  simulation  program. 

FIGURE  III.B  List  of  Subroutines  Used  in  the  Simulation


BPERM         Executes the permutation test

DEVSQ         Calculates two vectors of squared deviations
              around corresponding location estimates

MEAN,         Calculate the sample mean,
MEDIAN,       median and midrange,
MIDRAN        for each of two samples

QHAT          Calculates an estimate of Q, Hogg's
              nonnormality indicator

QINT          Determines the interval in which Q is observed,
              by which $\alpha$ (the trimming proportion) is
              adaptively chosen

SAMPER        Chooses the permutation sample from the set
              of all possible permutations

SHELL         Performs a shell sort

TCMEAN        Calculates the $\alpha$ mean of trimmings

TMEAN         Calculates the $\alpha$ trimmed mean

TMNSCL        Calculates estimates of scale based on the
              trimmed mean (similar for TCMNSC, MNSCAL,
              MEDSCL, and MIDSCL)


C.  Results  of  the  Simulation  Study

The simulation results are presented in three sections. In the
first, we compare the power of the eight tests under the various
distributions. The second section examines the performance of
$\hat{Q}$ as an estimator of Q. In the third section, we discuss a
time saving method of performing the permutation test.

1.   Power  Comparisons 

The reader should refer to Tables A-1 through A-8 and Figures
B-1 through B-8 in Appendix 2. The findings can be summarized as
follows.

1)  The means of trimmings ($m^c(0)$, $m^c(.2)$, $m^c(.3)$) perform
better than either the 20% or 30% trimmed means for the short- to
medium-tailed distributions, but the opposite is true for the
long-tailed Cauchy and 10% Mixed Normal, where the trimmed means perform
far  better.   In  fact,  for  the  10%  Mixed  Normal,  the  tests  based  on 
the  20%  and  30%  trimmed  means  are  the  most  powerful  tests.   They 
outperform  any  of  the  "standard"  tests  (those  based  on  the  midrange, 
mean  and  median)  and  the  adaptive  test.   This  was  the  only 
distribution  where  one  of  those  four  was  not  the  most  powerful. 

2)  The  mean  test  performs  well  for  all  except  the  Cauchy  and 
Mixed  Normal,  but  even  for  those  distributions  its  power  is  greater 
than  the  other  means  of  trimmings.   Also  the  test  performs  better 
than  might  be  expected  for  the  Double  Exponential. 

3)  The median test did not perform well at all except for the
Cauchy and Mixed Normal; even there it was not the most powerful
test. The median test does not perform well even for the Double
Exponential, where we might expect that it would.

4)  The  adaptive  estimation  test  consistently  performs  well, 
especially  for  the  heavy-tailed  distributions.   It  is  always  in  the 
top  group  of  tests  in  terms  of  power.   No  other  statistic  is  so 
consistent. 

Thus  while  the  adaptive  statistic  does  not  always  yield  the 
single  most  powerful  test,  under  no  distribution  is  any  other  test 
clearly  more  powerful  than  the  adaptive.   In  fact,  no  test  is  the 
overwhelming  favorite  for  any  distribution. 

2.  Performance of $\hat{Q}$

We calculated average values of $\hat{Q}$ (with standard errors) for
each run of the simulation. These results are presented for the
four values of $\theta$ examined in each distribution, along with the
true population value of Q. As can be seen in Table III.C below, the
statistic $\hat{Q}$ is invariant to changes in scale, but, as noted by
Boyer and Kolson (1983), tends to underestimate the population value
Q. For the Uniform distribution, this error is not substantial
($\hat{Q}$ averages 1.85 when Q = 1.90) but as the tailweight of the
population increases, the degree of underestimation becomes more
severe.

TABLE III.C   Observed Values of Q Compared with Population Values

Each entry gives the average value of the estimate of Q, with the
standard error of estimate in parentheses beneath it. Columns θ1
through θ4 correspond to the four values of θ examined.

Distribution        Q        θ1       θ2       θ3       θ4

Uniform            1.90    1.842    1.851    1.848    1.847
                           (.213)   (.210)   (.207)   (.203)

Prescott(.25)      2.20    1.952    1.963    1.969    1.955
                           (.226)   (.231)   (.224)   (.221)

Normal             2.58    2.109    2.096    2.111    2.116
                           (.265)   (.258)   (.260)   (.265)

Prescott(.75)      2.95    2.240    2.262    2.266    2.251
                           (.290)   (.298)   (.282)   (.286)

Double Exp         3.30    2.363    2.392    2.371    2.392
                           (.323)   (.310)   (.308)   (.330)

Mixed Normal       4.95    2.677    2.656    2.680    2.690
                           (.521)   (.521)   (.510)   (.503)

Cauchy            10.00    3.095    3.102    3.114    3.132
                           (.579)   (.594)   (.596)   (.594)



For example, in the case of the Double Exponential, the average
$\hat{Q}$ is 2.38 for a population value of Q = 3.30; Q = 10.0 for the
Cauchy but the average $\hat{Q}$ is 3.11. At the completion of this
project we discovered that when n = 10 the numerator of $\hat{Q}$
actually estimates the upper and lower 10% rather than 5% of the
distribution, so that the population values of Q for this special
case are smaller than the general values which appear in the table
above. For example, at n = 10 the population values of Q are 5 for
the Cauchy, and 3.4 for the 10% Mixed Normal. Hence the values of
$\hat{Q}$ which we observed do not show such marked underestimation.
The fact that our adaptive procedure displayed such consistently high
power even under these conditions suggests that only crude estimates
of tailweight are necessary for this test to perform well.

3.  A  Permutation  Test  Short-Cut

In this simulation, we were only interested in .05 rejection
rates. Thus, for any given replication, if rejection at the .05
level became impossible (because more than 25 of the permutation
values were more extreme than the original value) the test was
terminated. For runs of the simulation at the null hypothesis (i.e.
$\theta = \theta_x/\theta_y = 1$) an average of only 150
(approximately) sampled permutations were necessary. For the cases
of the most extreme departures from the null hypothesis which we
examined, an average of 493 permutations were required. This
disparity resulted in a ratio of almost 5 to 1 in CPU minutes
required to complete the simulation (a maximum of 635 CPU minutes to
a minimum of 130), a substantial time savings. Thus in an actual
application of the permutation test, one might wish to consider 500
to 1000 samples of the set of permutations, but only continue
evaluation of the statistic $u(\cdot)$ while $H_0$ can still be
rejected at the desired level of $\alpha$.
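A minimal Python sketch of this short-cut is given below. It is an illustration only: the ratio-of-standard-deviations statistic, the function names, and the independent random splits are assumptions made here rather than the report's Fortran routine BPERM. Sampling stops as soon as the count of permuted values at least as large as the observed one exceeds $\alpha$ times the planned number of permutations, since rejection is then impossible.

```python
import math
import random


def sd_ratio(x, y):
    """Ratio of standard deviations (divisor n), x over y."""
    def sd(v):
        mean = sum(v) / len(v)
        return math.sqrt(sum((vi - mean) ** 2 for vi in v) / len(v))
    return sd(x) / sd(y)


def early_stop_permutation_test(x, y, statistic=sd_ratio,
                                n_perm=500, alpha=0.05, seed=0):
    """Return (rejected, permutations_used) for the one-tailed test."""
    rng = random.Random(seed)
    z = list(x) + list(y)
    n, total = len(x), len(x) + len(y)
    t = statistic(x, y)
    limit = math.floor(alpha * n_perm + 1e-9)   # 25 for alpha=.05, n_perm=500
    extreme = 0
    for used in range(1, n_perm + 1):
        idx = set(rng.sample(range(total), n))
        px = [z[i] for i in range(total) if i in idx]
        py = [z[i] for i in range(total) if i not in idx]
        if statistic(px, py) >= t:
            extreme += 1
            if extreme > limit:                 # rejection no longer possible
                return False, used
    return True, n_perm


# Observed ratio is the smallest possible, so every split counts as extreme.
print(early_stop_permutation_test([0.0] * 5, [float(v) for v in range(1, 11)]))
```

In the first example every sampled split is at least as extreme as the observed one, so the test stops after 26 permutations instead of 500; when the observed statistic is far in the upper tail, all 500 are evaluated.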

IV.   Conclusion

We  have  seen  that,  in  general,  randomization  tests  based  on 
functions  of  trimmed  means  perform  well  for  the  two  sample  scale 
problem.   In  particular,  the  test  based  on  the  mean  (which  is  the 
permutation  F  test)  is  quite  powerful  for  all  except  the  heaviest 
tailed  distributions.   The  adaptive  test  is  by  far  the  most 
consistent  of  the  tests  we  have  examined  here.   Based  on  this 
finding  we  recommend  the  use  of  the  adaptive  test  for  this  problem. 
We  also  recommend  the  permutation  test  shortcut  discussed  in  section 
III.C.3. Continued research in this area could examine the power of
this adaptive procedure for sample sizes other than 10 and 10, and
consider the problems posed by unequal sample sizes. We believe the
adaptive statistic will continue to display the desirable properties
it has shown here.

APPENDIX  1
Source  Listing  of  Simulation  Program

C  GENERATION  PROGRAM 

C    PURPOSE 

C      GENERATES  THE  SAMPLES  FROM  VARIOUS  DISTRIBUTIONS 

C      FOR  THE  SIMULATION 

C 

C 

C  DEFINE  VARIABLE  NAMES 

C 

C  ID  -   INDICATES  SAMPLING  DISTRIBUTION 

C  SAMPL  1,2    -   REAL*8  ARRAY  OF  SAMPLE  VALUES  FROM  POP'N  1,2 

C  N,M         -   SAMPLE  SIZES 

C  NREPS       -   NUMBER  OF  INDEPENDENT  REPLICATIONS  DESIRED 

C  GAMMA       -   PARAMETER  OF  THE  PRESCOTT  FAMILY 

C  THETA  1,2    -   ACTUAL  SCALE  PARAMETERS  OF  POPULATION  1,2 

C  MT  1,2      -   ADDITIONAL  SCALE  PARMS  FOR  MIXED  NORMAL  DIST'N 

C  IX,JX,KX,LX  -   SEEDS  FOR  THE  RANDOM  NUMBER  GENERATORS 

C  DIX,DJX,DKX,DLX   -   DOUBLE  PRECISION  VAR'S  WITH  SEEDS  VALUES  FOR  RNG 

C 

      PROGRAM GEN
      REAL*8 SAMPL1(10),SAMPL2(10),R,A,B,T,DIX,DJX,DKX,DLX,PI
C
      REAL*4 GAMMA,X(10),Y(10),WK(50),BETA1,BETA2,THETA1,THETA2,MT1,MT2
C
      INTEGER*4 NREPS,IX,JX,KX,LX,N,M,ID
C
      CHARACTER*15 IDENT
C
      COMMON/RNG/DIX,DJX,DKX,DLX
C
      DATA NREPS,N,M,GAMMA/1000,10,10,0.00/
      DATA THETA1,THETA2,MT1,MT2/1.,1.,0.,0./
C
      READ(5,240) IX,JX,KX,LX
      WRITE(6,240) IX,JX,KX,LX
      DIX=IX
      DJX=JX
      DKX=KX
      DLX=LX
C
C   GENERATE THE SAMPLES
C
      DO 170 J=1,NREPS
      ID = 4.*GAMMA + 1
      GOTO (100,110,110,110,110,120,130), ID
  100 CALL UNIFOR(SAMPL1,SAMPL2,N,M,THETA1,THETA2)
      IDENT = 'UNIFORM'
      GOTO 150
C
  110 CALL PRESCT(SAMPL1,SAMPL2,N,M,THETA1,THETA2,GAMMA)
      GOTO (111,112,113,114,115), ID
  111 GOTO 150
  112 IDENT = 'PRESCOTT(.25)'
      GOTO 150
  113 IDENT = 'NORMAL'
      GOTO 150
  114 IDENT = 'PRESCOTT(.75)'
      GOTO 150
  115 IDENT = 'DOUBLE EXPON'
      GOTO 150
C
  120 CALL CAUCHY(SAMPL1,SAMPL2,N,M,THETA1,THETA2)
      IDENT = 'CAUCHY'
      GOTO 150
C
  130 CALL MIXED(SAMPL1,SAMPL2,N,M,THETA1,THETA2,MT1,MT2)
      IF (J .GT. 1) GOTO 140
      IDENT = 'MIXED NORM'
      WRITE(6,200) IDENT,NREPS
      WRITE(6,220) N,M,THETA1,THETA2,MT1,MT2
  140 WRITE(6,230) (SAMPL1(I),I=1,N)
      WRITE(6,230) (SAMPL2(I),I=1,M)
      GOTO 170
C
  150 IF (J .GT. 1) GOTO 160
      WRITE(6,200) IDENT,NREPS
      WRITE(6,210) N,M,THETA1,THETA2
  160 WRITE(6,230) (SAMPL1(I),I=1,N)
      WRITE(6,230) (SAMPL2(I),I=1,M)
C
  170 CONTINUE
C
      STOP
C
C      DEFINE OUTPUT FORMATS
C
  200 FORMAT(1X,A15,I5)
  210 FORMAT(2I5,2F10.5)
  220 FORMAT(2I5,4F10.5)
  230 FORMAT(10F8.4)
  240 FORMAT(4I10)
      END
C
C   SUBROUTINE CAUCHY
C    PURPOSE
C      GENERATES TWO SAMPLES OF SIZES N AND M, RESPECTIVELY, FROM
C      THE CAUCHY DISTRIBUTION WITH LOCATION PARAMETER ZERO AND SCALE
C      PARAMETERS BETA1 AND BETA2, RESPECTIVELY.  USES THE PROBABILITY
C      INTEGRAL TRANSFORM TECHNIQUE TO GENERATE CAUCHY DEVIATES FROM
C      UNIFORM DEVIATES.
C
C    USAGE
C      CALL CAUCHY(SAMPL1,SAMPL2,N,M,BETA1,BETA2)
C
C    SUBROUTINES CALLED
C      GGUBFS
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1   -   REAL*8 ARRAY OF LENGTH N CONTAINING THE SAMPLE VALUES
C                   FROM POPN 1
C      SAMPL2   -   REAL*8 ARRAY OF LENGTH M CONTAINING THE SAMPLE VALUES
C                   FROM POPN 2
C      N,M      -   SAMPLE SIZES
C      BETA1    -   SCALE PARAMETER OF POPN 1
C      BETA2    -   SCALE PARAMETER OF POPN 2
C
      SUBROUTINE CAUCHY(SAMPL1,SAMPL2,N,M,BETA1,BETA2)
      INTEGER*4 N,M
      REAL*8 SAMPL1(N),SAMPL2(M),DIX,DJX,DKX,DLX,PI
      REAL*4 BETA1,BETA2,A,B
      COMMON/RNG/DIX,DJX,DKX,DLX
      DATA PI/3.141592654/
C
      DO 100 I=1,N
      A = GGUBFS(DIX)
  100 SAMPL1(I) = BETA1 * TAN(PI*(A-.5))
C
      DO 110 I=1,M
      B = GGUBFS(DJX)
  110 SAMPL2(I) = BETA2 * TAN(PI*(B-.5))
      RETURN
      END
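The probability integral transform used by CAUCHY can be illustrated outside of Fortran. The following Python sketch is an illustration only: the function name `cauchy_sample` is hypothetical, and Python's `random.Random` stands in for the uniform generator GGUBFS used in the listing.

```python
import math
import random


def cauchy_sample(n, scale, rng):
    """Draw n Cauchy(0, scale) deviates by inverting the Cauchy c.d.f.

    If U is uniform on (0, 1), then scale * tan(pi * (U - 0.5)) is Cauchy
    with location zero and the given scale.
    """
    return [scale * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


rng = random.Random(1)
draws = sorted(cauchy_sample(20001, 2.0, rng))
sample_median = draws[10000]          # should be near the location, 0
sample_upper_quartile = draws[15000]  # should be near the scale, 2
```

The quartiles of a Cauchy distribution with location zero sit at plus and minus the scale parameter, which gives a quick sanity check on the generated values.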
C***********************************************************************
C  SUBROUTINE MIXED
C  PURPOSE
C    GENERATES TWO SAMPLES OF SIZES N AND M, RESPECTIVELY, FROM
C    A 10% MIXED NORMAL WITH SCALE PARAMETERS THETA1 AND THETA2
C    FOR 90% OF THE SAMPLE, AND MIXING SCALE PARAMETERS MT1 AND
C    MT2 FOR THE REMAINING 10%.  (THE SCALE PARAMETERS ARE
C    STANDARD DEVIATIONS)
C
C  USAGE
C    CALL MIXED(SAMPL1,SAMPL2,N,M,THETA1,THETA2,MT1,MT2)
C
C  SUBROUTINES/FUNCTIONS CALLED
C    GGNPM, GGUBFS
C
C  DESCRIPTION OF PARAMETERS
C    SAMPL1,2 -   REAL*8 ARRAY OF LENGTH N CONTAINING THE SAMPLE VALUES
C                 FROM POP'N 1,2
C    N,M      -   SAMPLE SIZES
C    THETA1,2 -   STANDARD DEVIATION OF POPN 1,2
C    MT1,2    -   STANDARD DEVIATION OF THE MIXING POPULATIONS
C
C  METHOD
C    CALLS SUBROUTINE GGNPM TO OBTAIN THE N(0,1) RANDOM DEVIATES,
C    THEN ADJUSTS THEM TO HAVE CORRECT VARIANCE
C
      SUBROUTINE MIXED(SAMPL1,SAMPL2,N,M,THETA1,THETA2,MT1,MT2)
      REAL*8 SAMPL1(N),SAMPL2(M),DIX,DJX,DKX,DLX
      REAL*4 X(10),Y(10),THETA1,THETA2,MT1,MT2,T,R
      INTEGER*4 N,M
      COMMON/RNG/DIX,DJX,DKX,DLX
C
      CALL GGNPM(DIX,N,X)
C
      DO 100 I=1,N
      T=THETA1
      R=GGUBFS(DKX)
      IF(R .LT. .10)T=MT1
      SAMPL1(I)=X(I)*T
  100 CONTINUE
C
      CALL GGNPM(DJX,M,Y)
C
      DO 110 I=1,M
      T=THETA2
      R=GGUBFS(DLX)
      IF(R .LT. .10)T=MT2
      SAMPL2(I)=Y(I)*T
  110 CONTINUE
C
      RETURN
      END
C
C   SUBROUTINE PRESCT
C
C    PURPOSE
C      GENERATES TWO SAMPLES OF SIZE N AND M, RESPECTIVELY, FROM
C      THE PRESCOTT FAMILY OF SYMMETRIC DISTRIBUTIONS DEFINED BY
C      GAMMA IN THE INTERVAL (0,1), HAVING SCALE PARAMETERS BETA1
C      AND BETA2
C
C    USAGE
C      CALL PRESCT(SAMPL1,SAMPL2,N,M,THETA1,THETA2,GAMMA)
C
C    SUBROUTINES CALLED
C      GGAMR, GGUBFS
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1   -   REAL*8 ARRAY OF LENGTH N CONTAINING THE SAMPLE
C                   VALUES FROM POPULATION 1
C      SAMPL2   -   REAL*8 ARRAY OF LENGTH M CONTAINING THE SAMPLE
C                   VALUES FROM POPULATION 2
C      BETA1,2  -   SCALE PARAMETER OF POPULATION 1,2
C      GAMMA    -   PRESCOTT FAMILY PARAMETER
C
C    METHOD
C      CALLS SUBROUTINE GGAMR TO OBTAIN GAMMA(GAMMA,1) DEVIATES,
C      MAKES A POWER TRANSFORMATION TO THE APPROPRIATE PRESCOTT
C      DISTRIBUTION, AND ADJUSTS TO THE CORRECT SCALE
C
      SUBROUTINE PRESCT(SAMPL1,SAMPL2,N,M,BETA1,BETA2,GAMMA)
      REAL*8 SAMPL1(N),SAMPL2(M),R,DIX,DJX,DKX,DLX
      REAL*4 GAMMA,X(10),Y(10),WK(20),BETA1,BETA2
      INTEGER*4 N,M
      COMMON/RNG/DIX,DJX,DKX,DLX
C
      CALL GGAMR(DIX,GAMMA,N,WK,X)
C
      DO 100 I=1,N
      SAMPL1(I) = (X(I) ** GAMMA) * BETA1
      R = GGUBFS(DKX)
  100 IF (R .LT. 0.5) SAMPL1(I) = -1 * SAMPL1(I)
C
      CALL GGAMR(DJX,GAMMA,M,WK,Y)
C
      DO 110 I=1,M
      SAMPL2(I) = (Y(I) ** GAMMA) * BETA2
      R = GGUBFS(DLX)
  110 IF (R .LT. 0.5) SAMPL2(I) = -1 * SAMPL2(I)
      RETURN
      END
C
C  SUBROUTINE UNIFOR
C  PURPOSE
C    GENERATES TWO SAMPLES OF SIZES N AND M FROM U(-THETA1,THETA1)
C    AND U(-THETA2,THETA2), RESPECTIVELY.
C
C  USAGE
C    CALL UNIFOR(SAMPL1,SAMPL2,N,M,THETA1,THETA2)
C
C  FUNCTION CALLED
C    GGUBFS
C
C  DESCRIPTION OF PARAMETERS
C    SAMPL1   -   REAL*8 ARRAY OF LENGTH N CONTAINING THE SAMPLE VALUES
C                 FROM POPN 1
C    SAMPL2   -   REAL*8 ARRAY OF LENGTH M CONTAINING THE SAMPLE VALUES
C                 FROM POPN 2
C    N,M      -   SAMPLE SIZES
C    THETA1   -   SCALE PARAMETER OF POPN 1
C    THETA2   -   SCALE PARAMETER OF POPN 2
C
C  METHOD
C    INVOKES THE PRIME UNIFORM RANDOM NUMBER GENERATOR
C
      SUBROUTINE UNIFOR(SAMPL1,SAMPL2,N,M,BETA1,BETA2)
      REAL*8 SAMPL1(10),SAMPL2(10),DIX,DJX,DKX,DLX
      REAL*4 BETA1,BETA2,A,B
      INTEGER*4 N,M
      COMMON/RNG/DIX,DJX,DKX,DLX
C
      DO 100 I=1,N
   99 A=GGUBFS(DIX)
      IF(A .LT. 0.000000001)GOTO 99
      SAMPL1(I)=(A-.5)*2.*BETA1
  100 CONTINUE
C
      DO 110 I=1,M
  101 B=GGUBFS(DJX)
      IF(B .LT. 0.000000001)GOTO 101
      SAMPL2(I)=(B-.5)*2.*BETA2
  110 CONTINUE
      RETURN
      END
C

C  SIMULATION  PROGRAM 

C     PURPOSE 

C      COMPARE  SIMILAR  MEASURES  OF  SCALE  BASED  ON  TRIMMED  MEANS 

C      FOR  THE  PRESCOTT  FAMILY  OF  SYMMETRIC  DISTRIBUTIONS,  CAUCHY, 

C      AND  MIXED  NORMAL  DISTRIBUTIONS 

C 

C  VARIABLE  DEFINITIONS 

C 

C  SAMPL  1,2  -  ARRAY  OF  SAMPLE  VALUES  FROM  POPULATION  1,2 

C  PSAMP  1,2  -  ARRAY  OF  SAMPLE  VALUES  AS  ASSIGNED  IN  THE 

C  PERMUTATION  TEST 

C  SQDEV  1,2  -  ARRAY  OF  SQUARED  DEVIATIONS  (SEE  SUB.  DEVSQ) 

C  COMB  -  ARRAY  OF  COMBINED  SAMPLE  VALUES 

C  OSTAT  -  VALUES  OF  THE  TEST  STATISTICS  EVALUATED  ON 

C  THE  ORIGINAL  SAMPLE  DATA 

C  PSTAT  -  VALUES  OF  THE  TEST  STATISTICS  EVALUATED  ON 

C  THE  PERMUTED  SAMPLE  DATA 

C  EXTREM  -  ACCUMULATOR  W/IN  PERM.  TEST  OF  EXTREM  OBS.

C  REJECT  -  COUNTS  REPS  WHICH  YIELDED  SIGNIFICANT  PERM.  TESTS 

C  REJPER  -  PERCENT  REJECTIONS  FOR  EACH  STATISTIC 

C  CONTIN  -  DETERMINES  CONTINUATION  OF  PERMUTATION  LOOP  FOR 

C  INDIVIDUAL  STATISTICS 

C  ALL  -  DETERMINES  POINT  OF  TERMINATION  OF  PERM.  LOOP 

C  ODD  -  NOTES  EVEN  OR  ODD  SAMPLE  SIZE  FOR  SUB.  MEDI 

C  N,M  -  SAMPLE  SIZES 

C  NREPS  -  NUMBER  OF  INDEPENDENT  REPLICATIONS  DESIRED 

C  NPERM  -  NUMBER  OF  PERMUTATIONS  TO  BE  SAMPLED 

C  NSTAT  -  NUMBER  OF  STATISTICS  TO  BE  TESTED 

C  CVAL  -  CRITICAL  VALUE  OF  EXTREM  OBS.  AT  P-.05 

C  ALPHA  -  DESIRED  AMOUNT  OF  TRIMMING 

C  LOC  1,2  -  LOCATION  ESTIMATOR  FOR  SAMPLE  1,2 

C  SCALE  1,2  -  SCALE  ESTIMATOR  FOR  SAMPLE  1,2 

C  Q  -  NONNORMALITY  INDICATOR  USED  IN  THE  ADAPTIVE  SCHEME 

C  INT  -  INDICATES  THE  INTERVAL  (2,6)  IN  WHICH  Q  IS  OBSERVED 

C  ICT  -  VECTOR  COUNTING  THE  TIMES  Q  WAS  PLACED  IN  EACH  INTERVAL

C  QSUM  -  SUMS  THE  VALUES  OF  Q  (  FOR  MEAN  Q) 

C  QSQ  -  SUMS  THE  SQUARED  VALUES  OF  Q  (FOR  VARIANCE  OF  Q)

C  IP  -  PERMUTATION  COUNTER  (USED  AS  A  CHECK) 

C  PSUM  -  SUMS  THE  NUMBER  OF  PERMUTATIONS  NECESSARY  FOR  EACH  REP

C  PCT  -  THE  NUMBER  OF  REPS  THE  PERM  TEST  ENDED  EARLY 

C  ISAM  -  INDICATOR  ARRAY  FOR  DIVISION  OF  SAMPLE  IN  PERM  TEST 

C  SEED  -  SEED  FOR  RANDOM  NUMBER  GENERATOR 

C  THETA  1,2  -  ACTUAL  SCALE  PARAMETER  FOR  POPULATION  1,2 

C  MT  1,2  -  ADDITIONAL  SCALE  PARMS  FOR  MIXED  NORMAL 
      PROGRAM SCALESIM
      REAL*6 SAMPL1(10),SAMPL2(10),PSAMP1(10),PSAMP2(10),COMB(20),
     1  SQDEV1(10),SQDEV2(10),OSTAT(10),PSTAT(10),SAMP(10),
     2  REJECT(10),REJPER(10),W(10),Z(10),SAMP1(10),SAMP2(10),
     3  LOC1,LOC2,SCALE1,SCALE2,QVAL1,QVAL2,Q,QHAT,X(10),Y(10),
     4  THETA1,THETA2,MT1,MT2,TM1,TM2,SC1,SC2,U,C,T,DIV,
     5  MEAN1,MEAN2,MEDI1,MEDI2,MIDRA1,MIDRA2,RATIO,CVAL,ALPHA,
     6  SUM,SUM2,HOLD1,HOLD2,TSUM,TCSUM,TMEAN,TCMEAN,SEED,
     7  QSUM,QSQ,REPS,PSUM,AVEPERM,AVEQ,VARQ
      INTEGER*3 IP,IC,JC,N,M,INT,NREPS,NSTAT,NPERM,ICOMB(20),ISEED,I1,S,
     1  INUM,IDEN,NSAM,ISAM(20),NSIZE,ISTART,I2,EXTREM(10),PCT,
     2  ICT(10),GS(10)
      LOGICAL*3 CONTIN(10),ALL,ODD
      CHARACTER*15 IDENT,LABEL(10)
      COMMON/PERMCOM/OSTAT,NSTAT,N,M,COMB,INT,NPERM,CVAL
C 

C  DEFINE OUTPUT FORMATS
C
    1 FORMAT(I5)
    2 FORMAT(8I5)
    3 FORMAT(1X,A15,I5)
    4 FORMAT(' THIS RUN INVOLVED SAMPLING FROM THE ',A15,'DISTRIBUTION')
    5 FORMAT(' WITH ',I5,' REPLICATIONS',/)
    6 FORMAT(2I5,2F10.5)
    7 FORMAT(2I5,4F10.5)
    8 FORMAT(' SAMPLE SIZES WERE: ',I5,' AND ',I5)
    9 FORMAT(' SCALE PARAMETERS WERE: ',F7.4,' AND ',F7.4,/)
   10 FORMAT(' SCALE PARAMETERS FOR SAMPLE 1: ',F7.4,' AND ',F7.4)
   11 FORMAT(' AND FOR SAMPLE 2: ',F7.4,' AND ',F7.4,/)
   12 FORMAT(10F8.4)
   13 FORMAT(' THE VALUE OF ',A15,'STATISTIC FOR THE ORIGINAL SAMPLE',
     1F10.5)
   14 FORMAT(/,' THE PERMUTATION TEST ON THE ',I5,
     1'TH REPLICATION WAS TERMINATED AFTER ',I5,' PERMUTATIONS',//)
   15 FORMAT(' EXTREM(',I1,'): ',I5,2X,' REJECT(',I1,'): ',F5.2)
   16 FORMAT(' REJECTION RATE FOR THE TEST BASED ON ',A15,'IS ',F7.5,/)
   17 FORMAT(' NPERM: ',I5,' CVAL: ',F8.4,' SEED: ',F8.2,' NSTAT: ',I5)
   18 FORMAT(1X,I5,' TIMES THE ADAPTIVE STATISTIC USED ',A15,/)
   19 FORMAT(' AVERAGE NUMBER OF PERMUTATIONS: ',F7.2,/,
     1       ' THE PERMUTATION TEST ENDED EARLY ',I5,' TIMES',/)
   20 FORMAT(' AVERAGE VALUE OF Q: ',F7.4,' WITH VARIANCE= ',F7.4,/)
C
C 

C  DEFINE NUMBER OF PERMUTATIONS
C  AND NUMBER OF STATISTICS TO BE COMPARED
C  AND SET SEED FOR RANDOM NUMBER GENERATOR
C
      NPERM=500
      CVAL = 0.05*NPERM
      NSTAT=8
      READ(17,1)ISEED
      SEED=FLOAT(ISEED)
      CALL RANUP(SEED)
      WRITE(16,17) NPERM,CVAL,SEED,NSTAT
C
C         INITIALIZE ARRAYS
C
      DO 50 K=1,10
      REJECT(K) = 0
      OSTAT(K) = 0.0
      PSTAT(K) = 0.0
      CONTIN(K) = .TRUE.
      REJPER(K) = 0
      SAMPL1(K) = 0.0
      PSAMP1(K) = 0.0
      SAMP1(K) = 0.0
      SAMP(K) = 0.0
      SQDEV1(K) = 0.0
      X(K) = 0.0
      Z(K) = 0.0
      SAMPL2(K) = 0.0
      PSAMP2(K) = 0.0
      SAMP2(K) = 0.0
      SQDEV2(K) = 0.0
      Y(K) = 0.0
      W(K) = 0.0
   50 ICT(K) = 0
      DO 60 K=1,20
      ICOMB(K) = 13
      COMB(K) = 0.0
   60 ISAM(K) = 0
C
      PCT = 0
      PSUM = 0.0
      QSUM = 0.0
      QSQ = 0.0
      LABEL(1) = 'THE MIDRANGE'
      LABEL(2) = 'MC(.2)'
      LABEL(3) = 'MC(.3)'
      LABEL(4) = 'THE MEAN'
      LABEL(5) = 'M(.2)'
      LABEL(6) = 'M(.3)'
      LABEL(7) = 'THE MEDIAN'
      LABEL(8) = 'ADAPTATION'
      READ(15,2)(GS(I),I=1,8)
      WRITE(16,2)(GS(I),I=1,8)
C
C  BEGIN REPLICATION LOOP
C
C  INPUT THE SAMPLES
C
      READ(15,3) IDENT,NREPS
      WRITE(16,4) IDENT
      WRITE(16,5) NREPS
C
      DO 200 J=1,NREPS
      IF (IDENT .EQ. 'MIXED NORM') GOTO 110
C
      IF (J .GT. 1) GOTO 105
      READ(15,6) N,M,THETA1,THETA2
      WRITE(16,8) N,M
      WRITE(16,9) THETA1,THETA2
  105 READ(15,12) (SAMPL1(I),I=1,N)
      READ(15,12) (SAMPL2(I),I=1,M)
C*    WRITE(16,12) (SAMPL1(I),I=1,N)
C*    WRITE(16,12) (SAMPL2(I),I=1,M)
      GOTO 120
C
  110 IF (J .GT. 1) GOTO 115
      READ(15,7) N,M,THETA1,THETA2,MT1,MT2
      WRITE(16,8) N,M
      WRITE(16,10) THETA1,MT1
      WRITE(16,11) THETA2,MT2
  115 READ(15,12) (SAMPL1(I),I=1,N)
      READ(15,12) (SAMPL2(I),I=1,M)
C*    WRITE(16,12) (SAMPL1(I),I=1,N)
C*    WRITE(16,12) (SAMPL2(I),I=1,M)
C
C  COMBINE AND SORT THE SAMPLES
C
  120 DO 125 I=1,N
  125 COMB(I) = SAMPL1(I)
      DO 130 I=1,M
  130 COMB(I+N) = SAMPL2(I)
C
      CALL SHELL(COMB,N+M)
C
C         CALCULATE THE STATISTICS
C
C         K=1 : MC(0) -- MIDRANGE
C         K=2 : MC(.2)
C         K=3 : MC(.3)
C         K=4 : MC(.5) = M(0) -- MEAN
C         K=5 : M(.2)
C         K=6 : M(.3)
C         K=7 : M(.5) -- MEDIAN
C         K=8 : ADAPTIVELY CHOSEN TO BE ONE OF THE ABOVE
C
      DO 190 K=1,NSTAT
      GOTO (135,140,145,150,155,160,165,170), K
C
  135 CALL MIDSCL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2)
      GOTO 175
C
  140 ALPHA=.2
      GOTO 148
C
  145 ALPHA=.3
C
  148 CALL TCMNSC(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2,ALPHA)
      GOTO 175
C
  150 CALL MNSCAL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2)
      GOTO 175
C
  155 ALPHA=.2
      GOTO 162
C
  160 ALPHA=.3
C
  162 CALL TMNSCL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2,ALPHA)
      GOTO 175
C
  165 CALL MEDSCL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2)
      GOTO 175
C
  170 CALL QINT(SAMPL1,SAMPL2,N,M,Q,INT)
      OSTAT(K) = OSTAT(INT)
      ICT(INT) = ICT(INT) + 1
      QSUM = QSUM + Q
      QSQ = QSQ + Q ** 2
      GOTO 190
C
  175 OSTAT(K) = RATIO(SCALE1,SCALE2)
C*180 WRITE(16,13)LABEL(K),OSTAT(K)
  190 CONTINUE
C 
C
C         RUN THE PERMUTATION TEST
C
      CALL BPERM(ALL,EXTREM,IP)
C
      PSUM = PSUM + IP
      IF (IP .LT. NPERM) PCT = PCT + 1
C
      IF (ALL) GOTO 192
C
C*      WRITE(16,14)J,IP
      GOTO 200
C
  192 DO 195 K=1,NSTAT
      IF (EXTREM(K) .LT. CVAL) REJECT(K) = REJECT(K) + 1.0
C*      WRITE(16,15)K,EXTREM(K),K,REJECT(K)
  195 CONTINUE
C
  200 CONTINUE
C
C  END OF REPLICATION LOOP
C
C        CALCULATE SUMMARY STATISTICS
C
      REPS = FLOAT(NREPS)
      DO 210 K=1,NSTAT
      REJPER(K) = REJECT(K) / REPS
      WRITE(16,16) LABEL(K),REJPER(K)
  210 CONTINUE
      DO 220 K=2,6
  220 WRITE(16,18) ICT(K),LABEL(K)
C
      AVEPERM = PSUM / REPS
      AVEQ = QSUM / REPS
      VARQ = (QSQ - (QSUM**2)/REPS) / (REPS-1)
      WRITE(16,19) AVEPERM,PCT
      WRITE(16,20) AVEQ,VARQ
      STOP
      END
C
C  SUBROUTINE BPERM
C
C  PURPOSE
C    TO PERFORM AN APPROXIMATE PERMUTATION TEST BY SAMPLING
C    1000 TIMES FROM THE SET OF ALL POSSIBLE PERMUTATIONS
C
C  USAGE
C    CALL BPERM(ALL,EXTREM,IP)
C
C  DESCRIPTION OF PARAMETERS
C    ALL     -   LOGICAL MARKER SIGNIFYING AN ABORTED PERMUTATION
C                LOOP MEANING P-VALUE FOR ALL TESTS GREATER THAN .05
C    EXTREM  -   VECTOR COUNTING EXTREM VALUES OF THE STATISTICS
C
C  SUBROUTINES CALLED
C    SAMPER
C
      SUBROUTINE BPERM(ALL,EXTREM,IP)
      REAL*6 OSTAT(10),PSTAT(10),COMB(20),CVAL,PSAMP1(10),PSAMP2(10)
      INTEGER*3 IC,JC,N,M,NSTAT,IP,NPERM,INT,ISAM(20),EXTREM(10)
      LOGICAL*3 ALL,CONTIN(10)
      COMMON/PERMCOM/OSTAT,NSTAT,N,M,COMB,INT,NPERM,CVAL
C
      DO 100 K=1,NSTAT
      CONTIN(K) = .TRUE.
      PSTAT(K) = 0.0
  100 EXTREM(K) = 0
C
      IP=0
      DO 200 I=1,NPERM
      IP = IP + 1
C
      CALL SAMPER(ISAM,N,M)
C
      IC=1
      JC=1
      DO 110 L=1,N+M
      IF (ISAM(L) .EQ. 1) THEN
         PSAMP1(IC) = COMB(L)
         IC = IC + 1
      ELSE
         PSAMP2(JC) = COMB(L)
         JC = JC + 1
      END IF
  110 CONTINUE
C
C         CALCULATE THE STATISTICS
C
C         K=1 : MC(0) -- MIDRANGE
C         K=2 : MC(.2)
C         K=3 : MC(.3)
C         K=4 : MC(.5) = M(0) -- MEAN
C         K=5 : M(.2)
C         K=6 : M(.3)
C         K=7 : M(.5) -- MEDIAN
C         K=8 : ADAPTIVELY CHOSEN TO BE ONE OF THE ABOVE
C
      DO 185 K=1,NSTAT
      IF (.NOT. CONTIN(K)) THEN
         GOTO 185
      ELSE
         GOTO (120,125,130,140,145,150,160,165),K
      END IF
C
  120 CALL MIDSCL(PSAMP1,PSAMP2,N,M,SCALE1,SCALE2)
      GOTO 170
C
  125 ALPHA=.2
      GOTO 135
C
  130 ALPHA=.3
C
  135 CALL TCMNSC(PSAMP1,PSAMP2,N,M,SCALE1,SCALE2,ALPHA)
      GOTO 170
C
  140 CALL MNSCAL(PSAMP1,PSAMP2,N,M,SCALE1,SCALE2)
      GOTO 170
C
  145 ALPHA=.2
      GOTO 155
C
  150 ALPHA=.3
C
  155 CALL TMNSCL(PSAMP1,PSAMP2,N,M,SCALE1,SCALE2,ALPHA)
      GOTO 170
C
  160 CALL MEDSCL(PSAMP1,PSAMP2,N,M,SCALE1,SCALE2)
      GOTO 170
C
  165 PSTAT(K) = PSTAT(INT)
      GOTO 185
C
  170 PSTAT(K) = RATIO(SCALE1,SCALE2)
  185 CONTINUE
      ALL = .FALSE.
      DO 190 K=1,NSTAT
      IF (.NOT. CONTIN(K)) THEN
         GOTO 190
      ELSE
         IF (PSTAT(K) .GT. OSTAT(K)) EXTREM(K) = EXTREM(K) + 1
         IF (EXTREM(K) .GT. CVAL) CONTIN(K) = .FALSE.
         IF (CONTIN(K)) ALL = .TRUE.
      END IF
  190 CONTINUE
C
C*      WRITE(16,191) ALL
C*191   FORMAT(' THE VALUE OF ALL IS: ',I2)
      IF (.NOT. ALL) GOTO 210
  200 CONTINUE
  210 RETURN
      END
C 
C 

C****************************************************************
C
C   SUBROUTINE DEVSQ
C    PURPOSE
C      SUBTRACT A QUANTITY FROM THE SAMPLE VECTOR AND SQUARE
C      THOSE DEVIATIONS
C
C    USAGE
C      CALL DEVSQ(SAMPL1,SAMPL2,N,M,LOC1,LOC2,SQDEV1,SQDEV2)
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1 (2)  -   REAL*6 ARRAY OF SIZE N (M) CONTAINING
C                      SAMPLE VALUES FROM POPULATION 1 (2)
C      LOC1 (2)    -   LOCATION ESTIMATES FOR SAMPLE 1 (2)
C      SQDEV1 (2)  -   THE SQUARED DEVIATION FOR SAMPLE 1 (2)
C
      SUBROUTINE DEVSQ(X,Y,N,M,TM1,TM2,Z,W)
      REAL*6 X(N),Y(M),TM1,TM2,Z(N),W(M)
      INTEGER*3 N,M
C
      DO 100 I=1,N
  100 Z(I) = (X(I) - TM1) ** 2
C
      DO 110 I=1,M
  110 W(I) = (Y(I) - TM2) ** 2
      RETURN
      END
C
C   SUBROUTINE MEAN
C
C    PURPOSE
C      CALCULATES THE SAMPLE MEAN FOR EACH OF TWO SAMPLES
C
C    USAGE
C      CALL MEAN(SAMPL1,SAMPL2,N,M,LOC1,LOC2)
C
C    DESCRIPTION OF PARAMETERS
C      SAMP1 (2)   -   REAL*6 ARRAY OF LENGTH N (M) CONTAINING THE
C                      SAMPLE VALUES
C      LOC1 (2)    -   ESTIMATE OF THE LOCATION PARAMETER (THE MEAN)
C                      FOR SAMPLE 1 (2)
C
      SUBROUTINE MEAN(X,Y,N,M,MEAN1,MEAN2)
      REAL*6 X(N),Y(M),SUM1,SUM2,MEAN1,MEAN2
      INTEGER*3 N,M
C
      SUM1=0.0
      SUM2=0.0
C
      DO 100 I=1,N
  100 SUM1 = SUM1 + X(I)
      MEAN1 = SUM1 / FLOAT(N)
C
      DO 120 I=1,M
  120 SUM2 = SUM2 + Y(I)
      MEAN2 = SUM2 / FLOAT(M)
C
      RETURN
      END
C 
C 

C
C   SUBROUTINE MNSCAL
C    PURPOSE
C      CALCULATES AN ESTIMATE OF SCALE BASED ON THE MEAN FOR EACH
C      OF TWO SAMPLES
C
C    USAGE
C      CALL MNSCAL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2)
C
C    SUBROUTINES CALLED
C      MEAN, DEVSQ
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1 (2)   -   REAL*6 ARRAY OF LENGTH N (M) CONTAINING
C                       SAMPLE VALUES FROM POPULATION 1 (2)
C      LOC1 (2)     -   ESTIMATE OF LOCATION BASED ON SAMPLE 1 (2)
C      SCALE1 (2)   -   RETURNED VALUE OF THE ESTIMATE OF SCALE
C                       FROM SAMPLE 1 (2)
C
      SUBROUTINE MNSCAL(SAMP1,SAMP2,N,M,SCALE1,SCALE2)
      REAL*6 SAMP1(10),SAMP2(10),LOC1,LOC2,SCALE1,SCALE2,
     1       SQDEV1(10),SQDEV2(10),SC1,SC2
      INTEGER*3 N,M
C
      CALL MEAN(SAMP1,SAMP2,N,M,LOC1,LOC2)
C
      CALL DEVSQ(SAMP1,SAMP2,N,M,LOC1,LOC2,SQDEV1,SQDEV2)
C
      CALL MEAN(SQDEV1,SQDEV2,N,M,SC1,SC2)
C
      SCALE1=SC1
      SCALE2=SC2
C
      RETURN
      END
C 
C 

C****************************************************************
C
C   SUBROUTINE MEDIAN
C
C    PURPOSE
C      CALCULATES THE SAMPLE MEDIAN FOR EACH OF TWO SAMPLES
C
C    USAGE
C      CALL MEDIAN(SAMPL1,SAMPL2,N,M,LOC1,LOC2)
C
C    SUBROUTINES CALLED
C      SHELL
C
C    DESCRIPTION OF PARAMETERS
C      SAMP1 (2)   -   REAL*6 ARRAY OF LENGTH N (M) CONTAINING THE
C                      SAMPLE VALUES FROM POPULATION 1 (2)
C      LOC1 (2)    -   ESTIMATE OF THE LOCATION PARAMETER (THE MEDIAN)
C                      FOR EACH SAMPLE
C
      SUBROUTINE MEDIAN(X,Y,N,M,MEDI1,MEDI2)
      REAL*6 X(N),Y(M),MEDI1,MEDI2
      INTEGER*3 N,M
      LOGICAL ODD
C
      MEDI1=0.0
      MEDI2=0.0
C
      CALL SHELL(X,N)
      ODD=.FALSE.
      IF (MOD(N,2) .NE. 0.0) ODD=.TRUE.
      IF (ODD) THEN
         MEDI1 = X( (N+1)/2 )
      ELSE
         MEDI1 = ( X( N/2 ) + X( N/2 + 1) ) / 2.
      ENDIF
C
      CALL SHELL(Y,M)
      ODD=.FALSE.
      IF (MOD(M,2) .NE. 0.0) ODD=.TRUE.
      IF (ODD) THEN
         MEDI2 = Y( (M+1)/2 )
      ELSE
         MEDI2 = ( Y( M/2 ) + Y( M/2 + 1) ) / 2.
      END IF
C
      RETURN
      END


C
C  SUBROUTINE MEDSCL
C  PURPOSE
C    CALCULATES AN ESTIMATE OF SCALE BASED ON THE MEDIAN FOR EACH
C    OF TWO SAMPLES
C
C  USAGE
C    CALL MEDSCL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2)
C
C  SUBROUTINES CALLED
C    MEDIAN, DEVSQ
C
C  DESCRIPTION OF PARAMETERS
C    SAMPL1 (2)  -   REAL*6 ARRAY OF LENGTH N (M) CONTAINING
C                    SAMPLE VALUES FROM POPULATION 1 (2)
C    LOC1 (2)    -   ESTIMATE OF LOCATION BASED ON SAMPLE 1 (2)
C    SCALE1 (2)  -   RETURNED VALUE OF THE ESTIMATE OF SCALE
C                    FROM SAMPLE 1 (2)
C
      SUBROUTINE MEDSCL(SAMP1,SAMP2,N,M,SCALE1,SCALE2)
      REAL*6 SAMP1(10),SAMP2(10),LOC1,LOC2,SCALE1,SCALE2,
     1       SQDEV1(10),SQDEV2(10),SC1,SC2
      INTEGER*3 N,M
C
      CALL MEDIAN(SAMP1,SAMP2,N,M,LOC1,LOC2)
C
      CALL DEVSQ(SAMP1,SAMP2,N,M,LOC1,LOC2,SQDEV1,SQDEV2)
C
      CALL MEDIAN(SQDEV1,SQDEV2,N,M,SC1,SC2)
C
      SCALE1=SC1
      SCALE2=SC2
C
      RETURN
      END
C 
C 

C  SUBROUTINE MIDRAN
C
C  PURPOSE
C    CALCULATES THE MIDRANGE FOR EACH OF TWO SAMPLES
C
C  USAGE
C    CALL MIDRAN(SAMPL1,SAMPL2,N,M,LOC1,LOC2)
C
C  SUBROUTINES CALLED
C    SHELL
C
C  DESCRIPTION OF PARAMETERS
C    SAMPL1 (2)  -   REAL*6 ARRAY OF LENGTH N (M) CONTAINING THE
C                    SAMPLE VALUES
C    LOC1 (2)    -   ESTIMATE OF THE LOCATION PARAMETER (MIDRANGE)
C                    FOR EACH SAMPLE
C
      SUBROUTINE MIDRAN(X,Y,N,M,MIDRA1,MIDRA2)
      REAL*6 X(N),Y(M),MIDRA1,MIDRA2
      INTEGER*3 N,M
C
      MIDRA1 = 0.0
      MIDRA2 = 0.0
C
      CALL SHELL(X,N)
      MIDRA1 = ( X(1) + X(N) ) / 2.0
C
      CALL SHELL(Y,M)
      MIDRA2 = ( Y(1) + Y(M) ) / 2.0
C
      RETURN
      END


C
C   SUBROUTINE MIDSCL
C
C    PURPOSE
C      CALCULATES AN ESTIMATE OF SCALE BASED ON THE MIDRANGE FOR
C      EACH OF TWO SAMPLES
C
C    USAGE
C      CALL MIDSCL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2)
C
C    SUBROUTINES CALLED
C      MIDRAN, DEVSQ
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1 (2)  -   REAL*6 ARRAY OF LENGTH N (M) CONTAINING
C                      SAMPLE VALUES FROM POPULATION 1 (2)
C      LOC1 (2)    -   ESTIMATE OF LOCATION BASED ON SAMPLE 1 (2)
C      SCALE1 (2)  -   RETURNED VALUE OF THE ESTIMATE OF SCALE
C                      FROM SAMPLE 1 (2)
C
      SUBROUTINE MIDSCL(SAMP1,SAMP2,N,M,SCALE1,SCALE2)
      REAL*6 SAMP1(10),SAMP2(10),LOC1,LOC2,SCALE1,SCALE2,
     1       SQDEV1(10),SQDEV2(10),SC1,SC2
      INTEGER*3 N,M
C
      CALL MIDRAN(SAMP1,SAMP2,N,M,LOC1,LOC2)
C
      CALL DEVSQ(SAMP1,SAMP2,N,M,LOC1,LOC2,SQDEV1,SQDEV2)
C
      CALL MIDRAN(SQDEV1,SQDEV2,N,M,SC1,SC2)
C
      SCALE1=SC1
      SCALE2=SC2
C
      RETURN
      END
C 
C 

C
C   FUNCTION QHAT
C
C    PURPOSE
C      CALCULATES Q, THE NONNORMALITY INDICATOR BY WHICH ALPHA IS
C      DETERMINED ADAPTIVELY (SEE HOGG 1974)
C
C    USAGE
C      QVAL = QHAT(SAMP,N)
C
C    SUBROUTINES CALLED
C      SHELL
C
C    DESCRIPTION OF PARAMETERS
C      SAMP   -   REAL*6 ARRAY OF SIZE N CONTAINING SAMPLE VALUES
C                 FROM A POPULATION
C
      FUNCTION QHAT(X,N)
      REAL*6 X(N),HOLD1,HOLD2,QHAT
      INTEGER*3 N,INUM,IDEN
C
      CALL SHELL(X,N)
C
      INUM = 0.05*N
      IDEN = 0.5*N
      HOLD1 = 0.0
      HOLD2 = 0.0
C
      IF (INUM .LT. 1) GOTO 110
      DO 100 I=1,INUM
  100 HOLD1 = HOLD1 + X(N+1-I) - X(I)
  110 HOLD1 = HOLD1 + (.05*N - INUM) * ( X(N-INUM) - X(INUM+1) )
      HOLD1 = HOLD1/(.05*N)
C
      DO 120 I=1,IDEN
  120 HOLD2 = HOLD2 + X(N+1-I) - X(I)
      HOLD2 = HOLD2/(.5*N)
      IF (HOLD2 .LT. 0.000001) HOLD2=0.000001
C
      QHAT = HOLD1/HOLD2
      RETURN
      END
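The computation in QHAT may be easier to follow in a compact form. The Python sketch below mirrors the listing under stated assumptions: the names `tail_mean_difference` and `hogg_q` are hypothetical, and the fractional weighting of the next order statistic follows the QHAT code rather than any particular published definition.

```python
def tail_mean_difference(xs, frac):
    """Mean of the upper frac*n order statistics minus the mean of the lower
    frac*n order statistics of the sorted list xs, with a fractional weight
    on the next order statistic when frac*n is not an integer."""
    n = len(xs)
    k = int(frac * n)
    total = sum(xs[n - 1 - i] - xs[i] for i in range(k))
    total += (frac * n - k) * (xs[n - 1 - k] - xs[k])
    return total / (frac * n)


def hogg_q(sample):
    """Ratio of the 5% tail spread to the 50% tail spread (larger means heavier tails)."""
    xs = sorted(sample)
    denominator = max(tail_mean_difference(xs, 0.5), 1e-6)  # same floor as QHAT
    return tail_mean_difference(xs, 0.05) / denominator
```

For a sample of size 10 the numerator reduces to the range, since 5% of 10 observations is half an observation from each tail.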
C 



C
C   SUBROUTINE QINT
C
C    PURPOSE
C      DETERMINES THE INTERVAL IN WHICH Q IS OBSERVED IN ORDER
C      TO CHOOSE THE BEST TRIMMED MEAN AS SUGGESTED BY PRESCOTT
C      (SEE BOYER AND KOLSON, 1983)
C
C    USAGE
C      CALL QINT(SAMPL1,SAMPL2,N,M,Q,INT)
C
C    FUNCTIONS USED
C      QHAT
C
C    DESCRIPTION OF PARAMETERS
C      SAMP1 (2) -   REAL*6 ARRAY OF SIZE N CONTAINING SAMPLE VALUES
C      Q         -   THE ESTIMATED VALUE OF HOGG'S Q STATISTIC
C      INT       -   THE INTERVAL (2,6) WHEREIN QHAT LIES
C
      SUBROUTINE QINT(X,Y,N,M,Q,INT)
      REAL*6 X(N),Y(M),Q,QVAL1,QVAL2
      INTEGER*3 N,M,INT
C
      QVAL1 = QHAT(X,N)
      QVAL2 = QHAT(Y,M)
      Q = (QVAL1 + QVAL2) / 2.0
C
      IF ( Q .LT. 2.2 ) THEN
         INT = 2
      ELSE
         IF ( Q .LT. 2.4) THEN
            INT = 3
         ELSE
            IF ( Q .LE. 2.8) THEN
               INT = 4
            ELSE
               IF ( Q .LE. 3.0) THEN
                  INT = 5
               ELSE
                  INT = 6
               END IF
            END IF
         END IF
      END IF
C
C*     WRITE(16,100)Q,INT
C*100  FORMAT(' THE VALUE OF Q IS ',F7.5,' PLACED IN INTERVAL ',I2)
C
      RETURN
      END
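The interval lookup in QINT amounts to a small decision table. The sketch below restates it in Python; the names `choose_statistic` and `LABELS` are hypothetical, and the handling of the boundary values (strict versus non-strict comparison) is an assumption read off the listing above.

```python
# Index of the statistic selected for each interval of Q, using the same
# numbering (2 through 6) as the LABEL array in the main program.
LABELS = {2: "MC(.2)", 3: "MC(.3)", 4: "THE MEAN", 5: "M(.2)", 6: "M(.3)"}


def choose_statistic(q):
    """Map the averaged Q value to the index of the statistic to use."""
    if q < 2.2:
        return 2   # light tails: mean of the 20% trimmings
    if q < 2.4:
        return 3   # mean of the 30% trimmings
    if q <= 2.8:
        return 4   # near-normal tails: ordinary mean
    if q <= 3.0:
        return 5   # 20% trimmed mean
    return 6       # heavy tails: 30% trimmed mean
```

For example, `LABELS[choose_statistic(2.5)]` gives `'THE MEAN'`, so a sample pair with Q near 2.5 would be tested with the permutation F statistic.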
C****************************************************************
C
C   FUNCTION RATIO
C
C    PURPOSE
C      CALCULATE THE RATIO OF TWO STATISTICS
C
C    USAGE
C      STAT = RATIO(SCALE1,SCALE2)
C
C    DESCRIPTION OF PARAMETERS
C      SCALE1   -   SCALE ESTIMATE OF A SAMPLE FROM A POPULATION
C                   HAVING SMALLER ACTUAL SCALE
C      SCALE2   -   SCALE ESTIMATE OF A SAMPLE FROM A POPULATION
C                   HAVING LARGER ACTUAL SCALE
C
      FUNCTION RATIO(SC1,SC2)
      REAL*6 SC1,SC2,RATIO
C
      IF (SC1 .LT. 0.00001) SC1 = 0.00001
      RATIO = SQRT(SC2) / SQRT(SC1)
C
      RETURN
      END


C
C   SUBROUTINE SAMPER
C
C    PURPOSE
C      SAMPLE AN ELEMENT RANDOMLY FROM THE SET OF ALL POSSIBLE
C      PERMUTATIONS
C
C    USAGE
C      CALL SAMPER(ISAM,N,M)
C
C    FUNCTION CALLED
C      RANU
C
C    DESCRIPTION OF PARAMETERS
C      ISAM   -   RETURNED INDICATOR ARRAY OF LENGTH N+M
C      N,M    -   SAMPLE SIZES
C
C    METHOD
C      THE ARRAY ISAM IS USED TO INDICATE THE ELEMENTS OF THE
C      COMBINED SAMPLE THAT WILL BE ASSIGNED TO SAMPLE 1
C      (INDICATOR=1) OR SAMPLE 2 (INDICATOR=0) FOR THE RANDOMLY
C      SELECTED PERMUTATION.  THE ELEMENTS OF ISAM ARE INITIALIZED
C      TO 0 AND TURNED TO 1 BY RANDOM SAMPLING WITHOUT REPLACEMENT
C
      SUBROUTINE SAMPER(ISAM,N,M)
      INTEGER*3 N,M,I,NSAM,ISAM(N+M)
      REAL*6 U,C
C
      DO 100 L=1,N+M
  100 ISAM(L)=0
      C=FLOAT(N+M)
      NSAM=0
C
  150 U = RANU(0.0,1.0)
      I=INT(U*C) + 1
      IF (ISAM(I) .EQ. 1) GOTO 150
      ISAM(I)=1
      NSAM=NSAM+1
      IF (NSAM .LT. N) GOTO 150
C
      RETURN
      END
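The rejection scheme in SAMPER (draw an index, skip it if already taken, repeat until N indices are marked) can be sketched in Python as follows. The function name `sample_split` is hypothetical, and `random.Random` stands in for the RANU generator called in the listing.

```python
import random


def sample_split(n, m, rng):
    """Return an indicator list of length n+m with exactly n ones.

    Positions marked 1 go to permuted sample 1 and positions marked 0 go to
    permuted sample 2, as in the ISAM array.
    """
    indicator = [0] * (n + m)
    chosen = 0
    while chosen < n:
        i = int(rng.random() * (n + m))  # uniform index in 0 .. n+m-1
        if indicator[i] == 0:            # redraw if this position is already taken
            indicator[i] = 1
            chosen += 1
    return indicator


split = sample_split(10, 10, random.Random(7))
```

Every subset of size N is equally likely under this scheme, which is what the approximate permutation test requires.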
C 
C 

C 

C   SUBROUTINE  SHELL 

C 

C    PURPOSE 

C      SORT  A  SET  OF  DATA  INTO  ASCENDING  ORDER 

C 

C    USAGE 

C      CALL  SHELL(SAMP,NSIZE)

C 

C    DESCRIPTION  OF  PARAMETERS 

C      SAMP    -   ARRAY  OF  SAMPLE  DATA  TO  BE  SORTED 

C      NSIZE   -   SIZE  OF  SAMPLE 

C 

C    METHOD 

C      SHELL  SORT  TECHNIQUE 

C 

      SUBROUTINE SHELL(SAMP,NSIZE)
      REAL*6 SAMP(NSIZE),T
      INTEGER*3 S,NSIZE
C
      S=NSIZE
  100 S=INT(S/2)
      IF (S .LT. 1) GOTO 150
      DO 140 K=1,S
      DO 130 I=K,NSIZE-S,S
      J=I
      T=SAMP(I+S)
  110 IF (T .GE. SAMP(J)) GOTO 120
      SAMP(J+S)=SAMP(J)
      J=J-S
      IF (J .GE. 1) GOTO 110
  120 SAMP(J+S)=T
  130 CONTINUE
  140 CONTINUE
      GOTO 100
C
  150 RETURN
      END
C 
C 

C****************************************************************
C
C   FUNCTION TCMEAN
C
C    PURPOSE
C      CALCULATES THE MEAN OF THE TRIMMINGS DEFINED BY ALPHA
C
C    USAGE
C      STAT = TCMEAN(SAMP,N,ALPHA)
C
C    DESCRIPTION OF PARAMETERS
C      SAMP    -   REAL*6 ARRAY OF SIZE N CONTAINING THE SAMPLE
C                  VALUES FROM A POPULATION
C      ALPHA   -   THE PERCENT OF TRIMMING DESIRED
C
      FUNCTION TCMEAN(X,N,A)
      REAL*6 X(N),A,TCSUM,TCMEAN,DIV
      INTEGER*3 N,I1,I2,ISTART
C
      CALL SHELL(X,N)
C
      IF (A .LT. .00001) A=.00001
      DIV = 2. * N * A
      ISTART = N * A
      TCSUM = 0.0
      IF (ISTART .LT. 1) GOTO 110
      DO 100 I=1,ISTART
  100 TCSUM = TCSUM + X(N+1-I) + X(I)
  110 TCSUM = TCSUM + (N*A - ISTART) * ( X(ISTART+1) + X(N-ISTART) )
      TCMEAN = TCSUM / DIV
C
      RETURN
      END
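The mean of the trimmings computed by TCMEAN averages the observations that an alpha-trimmed mean would discard. The Python sketch below follows the same arithmetic; the name `mean_of_trimmings` is hypothetical and the code is an illustration, not a translation verified against the original program.

```python
def mean_of_trimmings(sample, alpha):
    """Average of the lowest and highest alpha*n order statistics.

    When alpha*n is not an integer, the next order statistic on each side
    enters with the fractional weight, as in TCMEAN.
    """
    xs = sorted(sample)
    n = len(xs)
    alpha = max(alpha, 1e-5)          # same floor as the listing
    k = int(n * alpha)
    total = sum(xs[i] + xs[n - 1 - i] for i in range(k))
    total += (n * alpha - k) * (xs[k] + xs[n - 1 - k])
    return total / (2.0 * n * alpha)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
midrange = mean_of_trimmings(data, 0.0)   # alpha near 0 gives the midrange
full_mean = mean_of_trimmings(data, 0.5)  # alpha = 0.5 gives the ordinary mean
```

This shows why the listing labels the midrange as MC(0) and the mean as MC(.5): both are limiting cases of the same statistic.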
C
C
C   SUBROUTINE TCMNSC
C
C    PURPOSE
C      CALCULATES AN ESTIMATE OF SCALE BASED ON THE DESIGNATED
C      MEAN OF TRIMMINGS FOR EACH OF TWO SAMPLES
C
C    USAGE
C      CALL TCMNSC(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2,ALPHA)
C
C    SUBROUTINES/FUNCTIONS CALLED
C      TCMEAN, DEVSQ
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1 (2)   -   REAL*8 ARRAY OF LENGTH N (M) CONTAINING
C                       SAMPLE VALUES FROM POPULATION 1 (2)
C      LOC1 (2)     -   ESTIMATE OF LOCATION BASED ON SAMPLE 1 (2)
C      SCALE1 (2)   -   RETURNED VALUE OF THE ESTIMATE OF SCALE
C                       FROM SAMPLE 1 (2)
C      ALPHA        -   THE AMOUNT OF TRIMMING REQUESTED
C
C
      SUBROUTINE TCMNSC(SAMP1,SAMP2,N,M,SCALE1,SCALE2,A)
      REAL*8 SAMP1(10),SAMP2(10),LOC1,LOC2,SCALE1,SCALE2,
     1       SQDEV1(10),SQDEV2(10),A
C     THE FUNCTION TYPE IS DECLARED SO THAT IMPLICIT TYPING DOES NOT
C     TREAT THE REAL*8 RESULT OF TCMEAN AS SINGLE PRECISION
      REAL*8 TCMEAN
      INTEGER*2 N,M
C
      LOC1 = TCMEAN(SAMP1,N,A)
      LOC2 = TCMEAN(SAMP2,M,A)
C
      CALL DEVSQ(SAMP1,SAMP2,N,M,LOC1,LOC2,SQDEV1,SQDEV2)
C
      SCALE1 = TCMEAN(SQDEV1,N,A)
      SCALE2 = TCMEAN(SQDEV2,M,A)
C
      RETURN
      END
C
C
C   FUNCTION TMEAN
C
C    PURPOSE
C      CALCULATES THE ALPHA TRIMMED MEAN FROM A SAMPLE
C
C    USAGE
C      LOC = TMEAN(SAMP,N,ALPHA)
C
C    DESCRIPTION OF PARAMETERS
C      SAMP    -   REAL*8 ARRAY OF SIZE N CONTAINING THE SAMPLE
C                  VALUES FROM A POPULATION
C      N       -   SIZE OF SAMPLE
C      ALPHA   -   THE PROPORTION OF TRIMMING DESIRED
C
C
      FUNCTION TMEAN(X,N,A)
      REAL*8 X(N),A,TSUM,TMEAN,DIV
      INTEGER*2 N,I1,I2,ISTART
C
      CALL SHELL(X,N)
C
      IF (A .GT. .499999) A = .499999
C     DIV IS THE TOTAL WEIGHT OF THE N - 2*N*A RETAINED OBSERVATIONS
      DIV = N - 2.0*N*A
      ISTART = N*A
      I1 = ISTART + 2
      I2 = N - ISTART - 1
      TSUM = 0.0
      IF (I1 .GT. I2) GOTO 110
      DO 100 I = I1,I2
  100 TSUM = TSUM + X(I)
C     ADD THE PARTIALLY TRIMMED OBSERVATION AT EACH END
  110 TSUM = TSUM + (1.0 + ISTART - N*A) * (X(ISTART+1) + X(I2+1))
      TMEAN = TSUM / DIV
C
      RETURN
      END
C
C
C***************************************************************
C
C   SUBROUTINE TMNSCL
C
C    PURPOSE
C      CALCULATES AN ESTIMATE OF SCALE BASED ON THE DESIGNATED
C      TRIMMED MEAN FOR EACH OF TWO SAMPLES
C
C    USAGE
C      CALL TMNSCL(SAMPL1,SAMPL2,N,M,SCALE1,SCALE2,ALPHA)
C
C    SUBROUTINES/FUNCTIONS CALLED
C      TMEAN, DEVSQ
C
C    DESCRIPTION OF PARAMETERS
C      SAMPL1 (2)   -   REAL*8 ARRAY OF LENGTH N (M) CONTAINING
C                       SAMPLE VALUES FROM POPULATION 1 (2)
C      LOC1 (2)     -   ESTIMATE OF LOCATION BASED ON SAMPLE 1 (2)
C      SCALE1 (2)   -   RETURNED VALUE OF THE ESTIMATE OF SCALE
C                       FROM SAMPLE 1 (2)
C      ALPHA        -   THE AMOUNT OF TRIMMING REQUESTED
C
C
      SUBROUTINE TMNSCL(SAMP1,SAMP2,N,M,SCALE1,SCALE2,A)
      REAL*8 SAMP1(10),SAMP2(10),LOC1,LOC2,SCALE1,SCALE2,
     1       SQDEV1(10),SQDEV2(10),A
C     THE FUNCTION TYPE IS DECLARED SO THAT IMPLICIT TYPING DOES NOT
C     TREAT THE REAL*8 RESULT OF TMEAN AS SINGLE PRECISION
      REAL*8 TMEAN
      INTEGER*2 N,M
C
      LOC1 = TMEAN(SAMP1,N,A)
      LOC2 = TMEAN(SAMP2,M,A)
C
      CALL DEVSQ(SAMP1,SAMP2,N,M,LOC1,LOC2,SQDEV1,SQDEV2)
C
      SCALE1 = TMEAN(SQDEV1,N,A)
      SCALE2 = TMEAN(SQDEV2,M,A)
C
      RETURN
      END
C
C
C***************************************************************
Appendix 2

Listing of Simulation Results

Power Tables and Figures

Tables

     Uniform Distribution
     Prescott(.25) Distribution
     Normal Distribution
     Prescott(.75) Distribution
     Double Exponential Distribution
     10% Mixed Normal Distribution
     Cauchy Distribution

Figures

     Uniform Distribution
     Prescott(.25) Distribution
     Normal Distribution
     Prescott(.75) Distribution
     Double Exponential Distribution
     10% Mixed Normal Distribution
     Cauchy Distribution

Legend 

Test  Statistic  Plot  Character 

m  (0.0)  diamond 

m  (0.5)  square 

m  (0.5)  triangle 

adaptive  star 
TABLE A-1

Simulation Results
.05 Rejection Rates

Uniform Distribution


              θ=1      θ=1.5    θ=2      θ=3

m^c(0.0)      0.048    0.501    0.824    0.973
m^c(0.2)      0.048    0.494    0.814    0.973
m^c(0.3)      0.055    0.489    0.799    0.969
m^c(0.5)      0.047    0.436    0.756    0.963
m(0.2)        0.050    0.295    0.538    0.818
m(0.3)        0.045    0.243    0.452    0.736
m(0.5)        0.046    0.205    0.395    0.630
adaptive      0.048    0.494    0.816    0.974
TABLE A-2

Simulation Results
.05 Rejection Rates

Prescott(.25) Distribution


              θ=1      θ=2      θ=3      θ=4

m^c(0.0)      0.052    0.663    0.933    0.981
m^c(0.2)      0.054    0.658    0.933    0.981
m^c(0.3)      0.052    0.674    0.936    0.980
m^c(0.5)      0.050    0.682    0.931    0.975
m(0.2)        0.051    0.494    0.817    0.911
m(0.3)        0.045    0.436    0.731    0.867
m(0.5)        0.047    0.359    0.631    0.780
adaptive      0.056    0.669    0.936    0.979
TABLE A-3

Simulation Results
.05 Rejection Rates

Normal Distribution


              θ=1      θ=2      θ=3      θ=4

m^c(0.0)      0.045    0.533    0.833    0.940
m^c(0.2)      0.047    0.530    0.829    0.941
m^c(0.3)      0.042    0.544    0.854    0.956
m^c(0.5)      0.047    0.569    0.871    0.958
m(0.2)        0.049    0.463    0.758    0.889
m(0.3)        0.052    0.410    0.699    0.835
m(0.5)        0.045    0.351    0.602    0.764
adaptive      0.051    0.546    0.857    0.955
TABLE A-4

Simulation Results
.05 Rejection Rates

Prescott(.75) Distribution


              θ=1      θ=2      θ=3      θ=4

m^c(0.0)      0.052    0.399    0.697    0.877
m^c(0.2)      0.049    0.397    0.696    0.869
m^c(0.3)      0.049    0.430    0.715    0.895
m^c(0.5)      0.054    0.450    0.739    0.907
m(0.2)        0.060    0.410    0.658    0.851
m(0.3)        0.056    0.377    0.597    0.806
m(0.5)        0.048    0.327    0.546    0.733
adaptive      0.055    0.444    0.737    0.908
TABLE A-5

Simulation Results
.05 Rejection Rates

Double Exponential Distribution


              θ=1      θ=2      θ=4      θ=6

m^c(0.0)      0.044    0.326    0.793    0.911
m^c(0.2)      0.045    0.320    0.784    0.914
m^c(0.3)      0.046    0.342    0.807    0.928
m^c(0.5)      0.048    0.359    0.826    0.945
m(0.2)        0.057    0.327    0.780    0.924
m(0.3)        0.054    0.295    0.750    0.900
m(0.5)        0.052    0.271    0.694    0.847
adaptive      0.059    0.365    0.823    0.948
TABLE A-6

Simulation Results
.05 Rejection Rates

Mixed Normal Distribution


              θ=1      θ=2      θ=4      θ=6

m^c(0.0)      0.048    0.297    0.555    0.706
m^c(0.2)      0.047    0.295    0.552    0.716
m^c(0.3)      0.047    0.302    0.572    0.746
m^c(0.5)      0.047    0.320    0.595    0.763
m(0.2)        0.042    0.336    0.778    0.909
m(0.3)        0.046    0.313    0.749    0.881
m(0.5)        0.056    0.287    0.683    0.836
adaptive      0.055    0.364    0.719    0.861

Note:  Population 1 was N(0,1) contaminated with 10% N(0,64).
If, for example, the scale ratio θ = 3, then Population 2 was
N(0,9) contaminated with 10% N(0,576).
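
The contamination scheme in the note can be sketched in a few lines. The Python fragment below is an illustration only and is not the report's FORTRAN; the function name `mixed_normal_sample` and its parameters are hypothetical. It assumes that each observation comes from N(0,1) with probability 0.90 and from N(0,64) (standard deviation 8) with probability 0.10, and that the whole mixture is then multiplied by the scale, so that a scale of 3 gives N(0,9) contaminated with 10% N(0,576).

```python
import random


def mixed_normal_sample(n, scale=1.0, contam_prob=0.10, contam_sd=8.0, rng=None):
    """Draw n values from a scaled, contaminated standard normal.

    Hypothetical helper: with probability contam_prob a value comes from
    N(0, contam_sd**2), otherwise from N(0, 1); the result is multiplied
    by scale.
    """
    rng = rng or random.Random()
    sample = []
    for _ in range(n):
        # Pick the component first, then draw from it.
        sd = contam_sd if rng.random() < contam_prob else 1.0
        sample.append(scale * rng.gauss(0.0, sd))
    return sample
```

Calling `mixed_normal_sample(10, scale=3.0)` would then play the role of Population 2 in the example of the note, under the assumptions stated above.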
TABLE A-7

Simulation Results
.05 Rejection Rates

Cauchy Distribution


              θ=1      θ=3      θ=5      θ=8

m^c(0.0)      0.052    0.318    0.478    0.618
m^c(0.2)      0.049    0.315    0.479    0.621
m^c(0.3)      0.049    0.333    0.498    0.642
m^c(0.5)      0.048    0.350    0.520    0.658
m(0.2)        0.038    0.446    0.687    0.846
m(0.3)        0.039    0.455    0.693    0.849
m(0.5)        0.037    0.417    0.674    0.834
adaptive      0.055    0.447    0.691    0.850
FIGURE B-1: UNIFORM DISTRIBUTION

FIGURE B-2: PRESCOTT(.25) DISTRIBUTION

FIGURE B-3: NORMAL DISTRIBUTION

[Plot not reproduced. Vertical axis: .05 rejection rate, from 0.0 to
1.0. Plot characters follow the legend given for this appendix.]

FIGURE B-4: PRESCOTT(.75) DISTRIBUTION

FIGURE B-5: DOUBLE EXPONENTIAL DISTRIBUTION

FIGURE B-6: MIXED NORMAL DISTRIBUTION

FIGURE B-7: CAUCHY DISTRIBUTION

ABSTRACT

When testing for equality of scale in two populations, the usual
F test has been shown to have undesirable properties when the
populations in question are heavy tailed.  The test is low in power,
but even worse, the demonstrated power of the F test cannot be
trusted, since the test fails to retain the .05 level under the null
hypothesis.  This report details a study of alternative tests for
this two-sample scale problem.  Seven symmetric populations that
vary in tailweight were chosen for study.  The power of eight test
statistics based on functions of trimmed means (the average of a
specified portion of the sample) is compared via permutation tests.
Of special interest is an adaptive test procedure, which first
estimates the tailweight of the population and then, based on that
estimate, chooses the amount of trimming used in the test statistic.
This procedure is shown to be the most consistently powerful of the
tests studied here.
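
The kind of procedure summarized above, a scale estimate built from trimmed means and compared through a permutation test, can be illustrated with a minimal Python sketch. It is not the report's FORTRAN and rests on several assumptions: the names `trimmed_mean`, `trimmed_scale`, and `permutation_scale_test` are hypothetical; the scale estimate is taken to be the trimmed mean of squared deviations from the sample's trimmed mean, which mirrors the TMEAN and DEVSQ calls in TMNSCL only on the assumption that DEVSQ returns squared deviations; each sample is centered at its own trimmed mean before pooling; and the p-value is approximated from random permutations rather than a full enumeration.

```python
import random


def trimmed_mean(values, alpha):
    """Alpha-trimmed mean with fractional weights at the trimming boundary."""
    x = sorted(values)
    n = len(x)
    g = n * min(alpha, 0.499999)      # amount trimmed from each end
    lo, hi = g, n - g
    total = 0.0
    for i, v in enumerate(x):
        # Weight is the overlap of [i, i+1] with the retained range [lo, hi].
        w = max(0.0, min(i + 1, hi) - max(i, lo))
        total += w * v
    return total / (hi - lo)


def trimmed_scale(sample, alpha):
    """Trimmed mean of squared deviations from the trimmed mean."""
    loc = trimmed_mean(sample, alpha)
    return trimmed_mean([(v - loc) ** 2 for v in sample], alpha)


def permutation_scale_test(x, y, alpha=0.2, n_perm=999, seed=0):
    """One-sided test of H0: theta = 1 against H1: theta > 1.

    Returns the observed ratio of scale estimates and an approximate
    permutation p-value.
    """
    rng = random.Random(seed)
    loc_x, loc_y = trimmed_mean(x, alpha), trimmed_mean(y, alpha)
    cx = [v - loc_x for v in x]       # center each sample separately
    cy = [v - loc_y for v in y]
    observed = trimmed_scale(cx, alpha) / trimmed_scale(cy, alpha)
    pooled = cx + cy
    n = len(cx)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)           # random reassignment to the two groups
        stat = trimmed_scale(pooled[:n], alpha) / trimmed_scale(pooled[n:], alpha)
        if stat >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

With two lists of measurements `x` and `y`, `permutation_scale_test(x, y, alpha=0.2)` returns the ratio and its approximate p-value; the null hypothesis would be rejected at the .05 level when the p-value is at most .05. Setting `alpha=0.0` reduces the scale estimate to the ordinary mean of squared deviations.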


